Abstract


There  is  a  tension  between  normative  and  descriptive  elements  in 
the  theory  of  rational  belief.  This  tension  has  been  reflected  in  work 
in  psychology  and  decision  theory,  as  well  as  in  philosophy.  Canons  of 
rationality  should  be  tailored  to  what  is  humanly  feasible.  But  rationality 
has  normative  content  as  well  as  descriptive  content. 

A  number  of  issues  related  to  both  deductive  and  inductive  logic  can 
be  raised.  Are  there  full  beliefs  —  statements  that  are  just  categorically 
accepted?  Should  statements  be  accepted  when  they  become  overwhelmingly 
probable?  What  is  the  structure  imposed  on  these  beliefs  by  rationality? 

Are  they  consistent?  Are  they  deductively  closed?  What  parameters,  if  any, 
does  rational  acceptance  depend  on?  How  can  accepted  statements  come  to 
be  rejected  on  new  evidence? 

Should degrees of belief satisfy the probability calculus? Does
conformity to the probability calculus exhaust the rational constraints
that can be imposed on partial beliefs? With the acquisition of new
evidence, should beliefs change in accord with Bayes' theorem? Are
decisions made in accord with the principle of maximizing expected utility?
Should they be?

A  systematic  set  of  answers  to  these  questions  is  developed  on  the 
basis of a probabilistic rule of acceptance and a conception of
interval-valued logical probability according to which probabilities are based on
known  frequencies.  This  leads  to  limited  deductive  closure,  a  demand  for 
only  limited  consistency,  and  the  rejection  of  Bayes'  theorem  as  universally 
applicable to changes of belief. It also becomes possible, given new evidence,
to reject previously accepted statements.
1. Introduction.

The  Greek  philosophers  conceived  of  man  as  a  rational  animal.  This 
rationality  was  a  potentiality  that  might  or  might  not  be  actualized  under 
certain  circumstances.  (The  competence/performance  distinction  is  hardly 
new.) Later philosophers have sought both to articulate the canons of
rationality,  and  to  apply  them  in  an  effort  to  understand  both  man  and 
his  world.  In  the  course  of  Western  philosophy,  it  has  become  ever  clearer 
that  the  notion  of  rationality  itself  is  problematic. 

Hume  divided  the  objects  of  knowledge  into  matters  of  fact  and 
relations  of  ideas.  Canons  of  rationality  concern  relations  of  ideas. 

In  modern  terms,  these  canons  concern  the  logical  relations  among  sentences 
or propositions. Whitehead and Russell, in their monumental treatise on
mathematical logic, Principia Mathematica (1959, 1910), attempted to
provide a complete characterization of these canons. No sooner had they
done  so  than  objections  arose.  These  objections  fell  into  two  groups: 

First, it was objected that many of the inferences allegedly licensed by
the logical framework of the Principia were intuitively invalid; for example,
from the denial of "If Road-Runner wins the fifth race, Speedy will not win
the third," it might be alleged to follow both that Road-Runner wins the
fifth and that Speedy wins the third. The canonical use of "⊃", "∧", etc.,
does not precisely reflect the use of the corresponding English connectives. Second,
it  was  objected  that  many  intuitively  valid  inferences  in  ordinary  language 
could not be captured in the formalism of Principia Mathematica, and this has
led  philosophical  logicians  to  devise  a  plethora  of  modal,  intensional,  causal, 
and  deontic  logics. 

Hume also emphasized the gulf between "ought" and "is." This comes
to us as the injunction not to confuse the normative and descriptive.
The constraints on our beliefs (on the relations of our ideas) imposed
by logic are intended to be a priori and normative, despite the fact that
in  designing  those  constraints  we  are  guided  by  intuition  and  ordinary 
usage.  But  if  we  are  going  to  be  guided  by  ordinary  usage,  it  behooves  us 
to  find  out  what  that  ordinary  usage  is  —  clearly  a  task  for  empirical 
investigation  rather  than  for  armchair  speculation.  In  recent  years  a 
number of psychologists (Johnson-Laird (1977), Henle (1962), Wason (1977))
have  explored  the  inferential  propensities  of  human  subjects,  and  have 
discovered  a  considerable  gap  between  the  canons  of  rationality  as  codified 
in  logic  texts,  and  the  ways  in  which  ordinary  people  reason. 

Does  this  show  that  people  are  not  rational?  Or  that  if  they  are 
potentially  rational,  they  too  rarely  actualize  that  potentiality?  Or 
that  their  performance  falls  short  of  their  competence?  Or  does  it  show 
that  logicians  have  formulated  canons  that  do  not,  after  all,  capture  the 
essence  of  human  rationality? 

Hume himself held few doubts about the nature of deductive relations.
But  he  emphasized,  more  sharply  than  any  of  his  predecessors,  the  difficulty 
of  finding  rational  constraints  for  nondeductive  argument.  Scientific 
inference,  learning  from  experience,  probable  argument,  all  escape  the  net 
of  deductive  rational  constraints.  John  Maynard  Keynes (1952,  1921)  and 
Rudolf  Carnap  (1950)  were  among  those  to  propose  that  there  was  a  logical 
notion  of  probability  that  could  be  called  on  to  provide  rational  constraints 
for  nondeductive  argument  and  inference.  Such  a  notion  would  also  provide 
a framework for decision theory. The program of finding
rational constraints on nondeductive inference in the calculus of probability
has  not  been  a  success.  With  regard  to  scientific  inference,  induction,  and 
probable  argument,  many  philosophers  now  argue  that  the  quest  for  rational 
constraints is misguided. Rather, we should look for historical, sociological,
and psychological accounts of why people accept the arguments they do, make
the inferences they do, believe the scientific theories and hypotheses they
believe. 

Probability  plays  a  large  role  in  philosophical  discussions  of  decision 
making,  choice,  and  belief,  but  the  interpretation  of  probability  most 
commonly employed is subjective: probability is just degree of belief.

There  is  still  a  normative  element:  degrees  of  belief  ought  to  satisfy 
the probability calculus. But in point of fact, this constraint is a purely
deductive one. Combined with a behavioral interpretation of belief, it
says roughly that you shouldn't be prepared to make a set of bets such that
a wily opponent can be sure of taking from you what you value.
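
The betting argument can be illustrated with a minimal sketch. It assumes that a degree of belief is read as the price one will pay for a bet returning one unit if the statement is true, and it covers only the case in which the prices for a statement and its negation sum to more than 1. The function name and the figures 0.7 and 0.5 are hypothetical, chosen for illustration.

```python
from fractions import Fraction

def net_payoff(price_a, price_not_a, stake=1):
    """Net payoff to a bettor who buys one bet on A and one bet on not-A.

    Each bet costs price * stake and pays `stake` if it wins. Exactly one
    of the two bets wins, so the net payoff is the same however A turns out.
    """
    cost = (price_a + price_not_a) * stake
    return stake - cost

# Hypothetical degrees of belief: 7/10 in A and 1/2 in not-A (sum exceeds 1).
incoherent = net_payoff(Fraction(7, 10), Fraction(1, 2))
# Degrees of belief that satisfy the calculus: 3/5 and 2/5 (sum is exactly 1).
coherent = net_payoff(Fraction(3, 5), Fraction(2, 5))

print(incoherent)  # negative: a loss whichever way A turns out
print(coherent)    # zero: no guaranteed loss from this pair of bets
```

When the two prices sum to less than 1, the opponent would take the other side of both bets; that case is omitted from the sketch.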

But  again,  if  we  suppose  that  probabilities  reflect  beliefs,  and  that 
behavior  is  a  way  of  getting  at  beliefs,  it  becomes  an  empirical  question 
whether people behave in the ways that philosophers regard as rational.
Psychologists have examined choice behavior, decision making, and the
behavioral manifestations of degrees of belief (Edwards (1954), Tversky
and Kahneman (1974), Nisbett and Ross (1980)), and discovered that people
do  not  choose,  decide,  and  believe  as  philosophical  canons  of  rationality 
suggest  they  ought. 

Again  we  are  faced  with  a  problem.  Are  the  canons  of  rationality 
embodied  in  ordinary  decision  theory  wrong?  Or  inappropriate  for  human 
beings?  Or  are  people  mostly  irrational?  Is  there  some  way  of  adjusting 
the  canons  of  rationality,  or  reinterpreting  the  actualities  of  behavior, 
so  that  the  gap  is  not  so  great  between  what  is  and  what  ought  to  be?  Or 
some  way  of  modifying  behavior  to  that  same  end? 
These questions, and those raised previously, suggest certain prior
questions.  What  is  it  that  we  want  of  a  normative  theory  of  rational 
belief?  What  sort  of  framework  of  terms  and  ideas  should  such  a  theory  be 
placed  in?  What  relation  should  we  expect  to  find  between  a  normative  and 
descriptive theory of inference and choice? In the sections that follow,
I shall try to provide an epistemological framework in which to seek answers
to  both  normative  and  descriptive  questions  about  belief,  inference,  and 
choice. 

2.  Methodology. 

There are a number of possible sources of principles of rationality.
In traditional philosophy, the source has often been taken to be rational
intuition, a faculty common to all men in virtue of their humanity, since
men are rational animals. Even if there is such a faculty (which seems
doubtful)  the  recent  debates  and  disagreements  concerning  the  nature  and 
extent  of  constraints  on  rational  belief  show  that  it  does  not  provide  a 
univocal  standard  that  can  lead  scholars  to  agreement. 

L. J. Cohen (1981) suggests that the source of our standards of rationality
is intuition (the untutored intuition of ordinary educated people)
subject to the constraint of consistency. We should make the minimum
modification in the deductive intuitions of the ordinary citizen to render those
intuitions  consistent.  (Since  consistency  is  itself  a  notion  of  deductive 
logic,  the  constraint  seems  either  vacuous  or  question-begging.)  This 
suggests  that  we  should  begin  with  an  empirical  inquiry  into  people's  logical 
intuitions.  But  it  is  not  clear  that  such  an  inquiry  would  be  any  more 
relevant  to  the  development  of  normative  standards  of  (inductive  or  deductive) 
logical cogency, than an inquiry into people's arithmetical intuitions would
be to the development of standards of arithmetical validity.

Stich and Nisbett (1980) also point to intuition as the grounds of
justification in human reasoning, but suggest that it is the intuition
of "experts" to which we should turn. They admit that "it is to be expected
that there will be some disputes over justification that admit of no
rational resolution" [p. 202], since my expert may be your crackpot. But
this  is  precisely  one  of  the  things  we  would  like  a  theory  of  rational 
belief  to  provide:  a  standard  for  sorting  expert  sheep  from  crackpot  goats. 
Einhorn and Hogarth (1981) refer to "the inescapable role of intuitive
judgment in decision making" (p. 61). Brian Ellis (1979) argues that the
laws  of  belief  are  the  laws  of  thought,  though  the  laws  of  thought  that 
interest  us  are  the  laws  of  ideal  thought  —  so  to  speak,  the  laws  of 
frictionless  thought,  by  analogy  with  the  laws  of  frictionless  billiard 
balls.  These  laws  are  discoverable  by  introspection,  which  I  take  to  be 
roughly  the  same  as  intuition.  But  Ellis  also  argues  that  this  is  not  true 
of the dynamic laws of belief, the laws of changes of belief.

On  the  psychological  side,  there  is  a  wide  spectrum  of  empirical 
studies.  Johnson-Laird  (1977)  contrasts  the  standard  logical  use  of  the 
truth-functional  connectives  and  quantifiers  with  the  use  of  the  (allegedly) 
corresponding  English  constructions,  and  finds  wide  discrepancies.  Kahneman 
and Tversky (1973) claim to show that people are often not "rational" in
their assessments of probability. Slovic, Fischhoff, and Lichtenstein (1977)
claim that "people systematically violate the principles of rational decision
making." Mynatt, Doherty, and Tweney (1977) have investigated "confirmation
bias" in the assessment of scientific evidence. Lyon and Slovic (1976)
claim that people often fail to take account of base rates in making
probability assessments.

There  are  several  difficulties  that  stand  in  the  way  of  taking  these 
investigations to be immediately relevant to the investigation of canons
of rationality. First, there is the problem of translation: "if P then Q"
in English may have a truth-functional meaning, but it is more likely
to mean one of (a) Q is derivable from P, (b) Q is derivable from P together
with some other things I know, (c) P, together with other things I know,
makes it probable that Q, (d) P, together with other things that I take
experts to know, would render Q very probable. Perhaps there are other
candidates  as  well.  The  point  is  that  the  standards  of  probabilistic  and 
deductive  cogency  that  are  considered  are  abstract  and  formal;  the  material 
of  an  experimental  investigation  is  necessarily  concrete  and  framed  in  ordinary 
rather  than  formal  discourse. 

Another difficulty, emphasized by Einhorn and Hogarth (1981) in their
brilliant  review  of  psychological  decision  theory,  is  that  the  standards 
of  rationality  themselves  are  in  dispute,  so  that  it  is  unclear,  when  the 
intuitions  of  experimental  subjects  disagree  with  the  intuitions  of  the 
experimenter,  whether  it  is  the  experimenter  or  the  subject  who  ought  to 
reform  his  ideas  of  rationality. 

Nevertheless,  there  are  some  clear  cases  —  among  them  some  of  those 
investigated  by  Kahneman  and  Tversky  —  in  which  the  subject  himself  seems 
likely  to  agree  that  he  has  made  a  "mistake."  I  have  in  mind  particularly 
the example (Tversky and Kahneman (1981), p. 454) in which a subject prefers
a sure gain of $240 to a 25% chance to gain $1000 and a 75% chance to gain
nothing, but also prefers a 75% chance to lose $1000 and a 25% chance to
lose nothing, to the alternative of a sure loss of $750. When the two
decisions are explicitly combined to yield a choice between a 25% chance
to win $240 and a 75% chance to lose $760, against a 25% chance to win
$250 and a 75% chance to lose $750, the second alternative is chosen by
100% of the subjects.
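
The arithmetic behind the combined choice can be checked with a short sketch. It takes the sure loss to be $750, the figure implied by the combined gamble, and treats each option as a list of (probability, payoff) pairs. The labels A to D and the helper names are assumptions introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

def combine(g1, g2):
    """Combine two independently resolved gambles into one outcome table.

    Each gamble is a list of (probability, payoff) pairs; the result maps
    each total payoff to its probability.
    """
    table = {}
    for (p1, x1), (p2, x2) in product(g1, g2):
        table[x1 + x2] = table.get(x1 + x2, 0) + p1 * p2
    return table

def expected_value(table):
    """Probability-weighted mean payoff of an outcome table."""
    return sum(p * x for x, p in table.items())

QUARTER, THREE_QUARTERS = Fraction(1, 4), Fraction(3, 4)
A = [(Fraction(1), 240)]                       # sure gain of $240
B = [(QUARTER, 1000), (THREE_QUARTERS, 0)]     # 25% chance to gain $1000
C = [(Fraction(1), -750)]                      # sure loss of $750
D = [(THREE_QUARTERS, -1000), (QUARTER, 0)]    # 75% chance to lose $1000

a_and_d = combine(A, D)  # the pair of options preferred separately
b_and_c = combine(B, C)  # the pair of options rejected separately

print(a_and_d)
print(b_and_c)
print(expected_value(b_and_c) - expected_value(a_and_d))
```

The rejected pair is better by $10 in each outcome at the same probabilities, which is why the combined presentation makes the preference reversal plain.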

It seems, then, that some intuitions are pretty dependable and pretty
universal. Nevertheless, in pursuing the consequences of even very simple
intuitions, which constitute our starting point, we must be prepared to
reexamine them at any juncture.

There  is  another  respect  in  which  intuition  provides  a  starting  point. 

In  developing  a  theory,  whether  it  is  a  normative  one  or  a  descriptive  one, 
we  must  choose  a  representation  for  the  domain  with  which  we  are  concerned: 
the  theory  will  concern  certain  objects,  and  certain  relations  among  objects. 
We intend the theory, whether it is normative or descriptive, to account for
or  to  influence  a  certain  realm  of  experience.  It  may  be  that  our  choice 
of  objects  and  relations  is  a  poor  one;  that  no  theory  framed  in  those 
terms  can  account  for  the  realm  of  experience  at  issue,  or  that  there  is 
no  way  of  applying  a  normative  theory  framed  in  those  terms.  Under  these 
circumstances, the theory is not false, the standards not "incorrect"; it
is rather the case that the theory is simply ill formed.

The  method  that  I  shall  follow,  then,  is  primarily  philosophical. 

I  shall  propose  certain  objects  and  certain  relations  as  the  ingredients 
of  a  theory  of  rational  belief.  I  shall,  on  the  basis  of  elementary 
intuitions,  claim  that  certain  of  these  relations  actually  hold  of  ideal 
beliefs,  and  that  the  resulting  theory  provides  a  rough  approximation  to 
actual  human  bodies  of  belief,  and  that,  in  addition,  it  can  function 
normatively, as what Ellis calls a "regulative ideal." For a regulative
ideal, we require a theory that gives us a standard for the criticism
and  improvement  of  our  actual  beliefs  (the  suggestion  that  we  believe 
only  what  is  true  does  not  give  such  a  standard);  but  we  also  require  a 
theory  that  goes  beyond  what  we  —  or  even  the  experts  —  actually  do.  It 
does  no  harm  to  have  an  ideal  that  can  only  be  approached. 

At the same time, empirical data are relevant. There is a wealth
of data purporting to show that normative inference theory and normative
decision theory are systematically and pervasively violated by actual
human inference and decision making. Certain of these studies will be
reviewed, and it will be suggested that in some cases and in some degree
the  normative  theory  sketched  here  is  not  so  often  or  so  flagrantly 
violated.  This  will  be  taken  to  show  that  a  restructuring  and  modification 
of  the  classical  normative  theory  of  rational  belief  may  lead  to  new 
questions and new research in psychological theory, as well, perhaps, as to
new  approaches  to  the  improvement  of  human  performance. 

There  is  another  way  in  which  empirical  data  can  provide  a  "test"  of 
a  normative  theory.  We  do  not  expect  our  subjects  to  conform  completely 
to  the  normative  theory.  But  when  they  fall  short,  we  expect  them  to  fall 
short  in  understandable  ways.  We  expect  normal  adults,  to  whom  our 
standards of rationality are applicable, to know the product of 4 and 7;
we do not expect them to know the product of 56934 and 45927. For much
the  same  reasons,  we  expect  them  to  know  the  probability  of  a  pair  of 
ones  resulting  from  the  throw  of  a  pair  of  dice,  but  we  don't  expect  them 
to  know  the  probability  that  two  people  in  a  group  of  25  will  have  the  same 
birthday.  Nor  do  we  expect  them  to  be  altogether  accurate  in  intuitive 
statistical inference.
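
The contrast between the two probability questions can be made concrete with a small sketch. It assumes fair six-sided dice and, for the birthday case, 365 equally likely and independent birthdays; the function names are illustrative only.

```python
from fractions import Fraction

def prob_pair_of_ones():
    """Probability that two fair dice both show a one."""
    return Fraction(1, 6) * Fraction(1, 6)

def prob_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday.

    Computed as one minus the probability that all n birthdays are
    distinct, under the equal-likelihood and independence assumptions.
    """
    p_distinct = Fraction(1)
    for k in range(n):
        p_distinct *= Fraction(days - k, days)
    return 1 - p_distinct

print(prob_pair_of_ones())              # a one-step product
print(float(prob_shared_birthday(25)))  # needs a 25-term product
```

The first answer is a single multiplication that can be done in the head; the second is somewhat better than even odds, a result few people would reach by intuition alone.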

Since our focus is the question of belief and what makes it rational,
we leave to one side all the very complex questions concerning utility
that are involved in general decision theory. We shall be concerned with
decisions  only  insofar  as  they  throw  light  on  actual  beliefs. 

Our procedure is thus intuitive: we shall seek principles of
rationality that are intuitively valid in simple cases, and extend them
to take account also of more complicated cases. At the same time we will
look  to  empirical  data  to  check  our  intuitions  —  and  also  from  another 
slant:  Are  failures  to  conform  to  the  canons  of  rationality  understandable 
in  terms  of  natural  human  limitations? 

3.  The  Terms  and  Scope  of  the  Theory. 

The objects of belief, those things to which an individual stands in
a certain relation when he has beliefs, have been variously taken to be
propositions, facts, states of affairs, sentences in mentalese, sentences
in the ordinary language of the individual. I shall take them to be
sentences in our language, but not our ordinary language. Rather, I shall
suppose that they are sentences in an extensional first order logic, with
the standard sentential connectives and quantifiers. This generates a
challenge of interpretation. If someone says, or acts as if he believes, that
Rover is a dog, we can represent what he believes as the sentence "dog(Rover),"
or "D(r)." But if someone says that John will get five dollars if he
cuts the lawn, it is only when the speaker is a (slightly perverse) logician
that we can sensibly represent his assertion as "C(j) ⊃ F(j)."

The  burden  of  interpretation  is  nontrivial,  particularly  if,  as  I 
suspect,  many  conditionals  are  best  interpreted  metalinguistically,  and  not 
by any sentence of our object language at all. (The claim that if P
then Q is then interpreted along the lines: if sentence P is added to my
body of knowledge (or to that of any sensible citizen), then the sentence Q
will also be a part of that body of knowledge.) But there are also
significant benefits conferred by this move. We can characterize a
demonstrably sound notion of logical validity. We can give a syntactical
notion of proof from premises and of theoremhood. If you claim that A
follows from B and I am skeptical, then if we can agree on translations
of A and B, T(A) and T(B), the issue can be resolved by a proof or a
counterexample. And if we can't agree on translations of A and B, that in itself
may help to show us where our differences lie.

Not  everything  we  believe  can  be  represented  in  a  standard  first 
order language. Certainly not everything that is believed can be represented
in a first order language that we can understand. We can no more
adequately represent a dog's beliefs in our regimented first order language
than a treatise on painting can be translated into a language lacking
color words. Our language is too poverty-stricken when it comes to odor
words  to  do  justice  to  the  dog's  beliefs.  But  there  is  still  a  wide  range 
of  things  believed  that  can  be  represented  in  the  way  suggested,  and  a 
theory  of  rational  belief,  even  if  it  were  limited  in  scope  to  those  things, 
would  be  interesting  and  useful. 

In sum: we take objects of belief to be sentences in an extensional
first order logic, with operations and identity, that includes the relation
'∈' (is a member of), and axioms for set theory.

In that language we have formal notions of deductive consequence (we
write "Q is a deductive consequence of P_1,...,P_n" as {P_1,...,P_n} ⊢ Q),
consistency (we write Consis(S) for "the set of sentences S is consistent"),
and theoremhood (we write ⊢ C for "C is a theorem"). Note that these are
purely syntactical notions; they rest on a syntactical notion of
provability, rather than on a semantical notion of entailment.

There is a fundamental ambiguity in the notion of belief. We speak
both of degrees of belief, and of belief simpliciter. When a coin is
tossed, I believe it will come to rest (rather than going into orbit or
disappearing), and I have a degree of belief of about a half that it will
come to rest with the heads uppermost. One might try to collapse these
notions: to construe full belief, or belief simpliciter, as belief of
the highest degree or belief of degree 1 (Jeffrey (1965)). Alternatively,
we may construe belief simpliciter as acceptance into a set of statements
that constitutes a body of knowledge or a rational corpus (Levi (1980)).

This  is  not  the  place  to  review  the  philosophical  pros  and  cons  of  the 
two  approaches  to  full  belief.  Perhaps  the  most  penetrating  discussion  is 
in Levi (1980). I shall simply adopt the second approach.

Given that we adopt the second approach, there are a number of questions
we should expect a theory of rational belief to answer. What is the structure
of  the  set  of  statements  constituting  a  rational  corpus?  How  do  statements 
get  into  a  rational  corpus?  A  less  frequently  addressed,  but  equally 
important  question:  How  do  statements  get  expunged  from  a  rational  corpus? 

Is  what  is  taken  to  belong  to  a  rational  corpus  dependent  on  context,  and 
if  so,  in  what  way? 

Degrees  of  belief  are  typically  associated  with  probabilities.  There 
are  a  large  number  of  interpretations  of  probability  available,  including 
finite and limiting frequency interpretations (Russell (1948), von Mises (1957),
Reichenbach (1949)), various forms of propensity interpretations (Popper
(1957), Mellor (1971)), logical range interpretations (Carnap (1950),
Hintikka (1965)), subjectivist or personalist interpretations with varying
degrees of normative force (Savage (1954), de Finetti (1980), Jeffrey (1965)),
as well as a number of nonstandard views represented by Cohen's "Baconian"
probability (Cohen (1977)), Popper's degree of corroboration (Popper (1959)),
and various notions of "degree of factual support" (Hempel and Oppenheim
(1945)), which do not satisfy the usual probability axioms.

I  shall  relate  both  degrees  of  belief  and  the  grounds  of  acceptance 
into a rational corpus to my own interpretation of probability (Kyburg (1961),
(1974)). This interpretation is syntactical, and probability is construed
as a syntactical relation between a sentence and a set of sentences construed
as a rational corpus. Given a sentence P, and a set of sentences S, if
S meets certain minimal requirements, the probability of P relative to S
will be a closed subinterval [p, q] of the interval [0, 1].

An example may help to clarify the definition that follows. Suppose
that P is the sentence "John will go to the movies tonight," and that S
is the set of sentences that represent my reasonable or justified beliefs
this afternoon. The probability, relative to my body of knowledge S, that
John will go to the movies tonight is the interval [.6, .7] under these
circumstances: First, S represents a set of reasonable beliefs. Second, I
know that John is going to decide whether or not to go to the movies by
drawing a chip from an urn, and that he will go if and only if the chip he
draws is black. If we represent the proper description of that chip by c
and the set of black objects by b, the sentence "P ≡ c ∈ b" is among the
sentences representing my reasonable beliefs. Third, I know that the chip
c is a member of the set of chips in the urn; representing this set by u,
the sentence "c ∈ u" also belongs to S. Fourth, what I know about the
set of chips u is that between 60% and 70% are black, and I don't know
anything more exact than that. The sentence expressing this fact will be
written "%(u,b) ∈ [.6, .7]"; we suppose that "%(u,b) ∈ [.6, .7]" is a member of
S. Fifth, relative to what I know (that is, S) the chip c must be a random member
of the set of chips u with respect to being black. This is taken to require,
not that I have some special knowledge about how c is selected, but that
there are no ingredients in S which would lead me to a conflicting probability.
For example, if I knew that John wanted to go to the movies very badly
and that he was likely to peek at a chip before he "selected" it, I would
have grounds for denying that c, the chip selected by John, was a random
member of u with respect to being black.

The definition of probability is roughly the following: The probability
of the sentence P, relative to the set of sentences S, is the closed
interval [p, q], Prob(P,S) = [p, q], just in case there are terms x, y, z such
that:

(1) S is a rational corpus. This requires that S satisfy certain syntactical
constraints that will be discussed shortly.

(2) "P ≡ x ∈ z" is a sentence in S.

(3) "x ∈ y" is a sentence of S.

(4) "%(y,z) ∈ [p, q]" is a sentence of S.

This last is a sentence saying that the proportion of y's that are z's
lies in the interval [p, q]. It may also be interpreted in terms of relative
frequency, or limiting frequency, or, most generally, measure. Levi, whose
interpretation of probability is quite close to mine, requires that it be
a statement of chance (Levi (1980), p. 251); if we so interpret it, we must
construe it nonextensionally.

(5) x is a random member of y with respect to z, relative to the set of
sentences S.

This last condition is also to be spelled out syntactically. It is
equivalent to the assertion that, relative to S, y is an appropriate
reference class for the question of whether x belongs to z. To give a
flavor of the syntactical constraints embodied in condition (5), we may
note that if "x ∈ y'", "y' ⊂ y", "%(y',z) ∈ [r,s]" belong to S,
where [r,s] is different from [p, q], then we will want to say that condition
(5) is not satisfied. That is, we will deny (5) if S includes the knowledge
that x is a member of some subset of y in which the relative frequency
of z's differs from that represented by (4).

We may illustrate this special case by reference to the previous example.
Suppose my body of knowledge or rational beliefs, S, is expanded by the
addition of sentences representing the knowledge that there are two kinds
of chips in the urn, round ones and square ones; that in deciding whether or
not to go to the movies John always chooses (deliberately) a round chip;
and that between 15% and 20% of the round chips are black. As before, c
is the chip that John will choose, b is the set of black objects, u the
set of chips in the urn. But now we must consider the partition of u
into two subsets: ru consisting of round chips and su consisting of square
chips. We have

    c ∈ ru
    ru ⊂ u
    %(ru,b) ∈ [.15, .20]

all in S, and therefore we will deny that c is a random member of u
with respect to b relative to the expanded S. But we will now be able to
claim that c is a random member of ru with respect to b and therefore
that the probability that John will go to the movies is the interval
[.15, .20].
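The way the narrower class ru displaces u in this example can be illustrated with a small sketch. It is only a toy version of the special case described above, not the full syntactical definition of randomness: a candidate class is set aside when the object is known to belong to one of its subsets that carries a different known frequency interval. The function name, the dictionary layout, and the use of the interval [.6, .7] for u (the value from the initial version of the example) are illustrative assumptions.

```python
def reference_interval(member_of, proper_subsets, freq):
    """Pick a reference class by the toy 'known subset with a different
    frequency' rule (a sketch, not the paper's full definition).

    member_of      -- set of class names the object is known to belong to
    proper_subsets -- dict: class name -> set of its known proper subsets
    freq           -- dict: class name -> (low, high) known frequency of the
                      target property in that class
    """
    undefeated = []
    for y in sorted(member_of):
        if y not in freq:
            continue
        rivals = proper_subsets.get(y, set())
        # y is set aside if the object is known to lie in a subset of y
        # whose known frequency interval differs from that of y.
        defeated = any(
            sub in member_of and sub in freq and freq[sub] != freq[y]
            for sub in rivals
        )
        if not defeated:
            undefeated.append(y)
    if len(undefeated) != 1:
        raise ValueError("this toy rule does not single out a reference class")
    return undefeated[0], freq[undefeated[0]]


# Initial version: all we know is that c is a chip in the urn u.
initial = reference_interval({"u"}, {"u": {"ru", "su"}}, {"u": (0.6, 0.7)})

# Modified version: we also know that c is round, and the frequency among round chips.
modified = reference_interval(
    {"u", "ru"},
    {"u": {"ru", "su"}},
    {"u": (0.6, 0.7), "ru": (0.15, 0.20)},
)
```

In the modified case the rule returns ru with the interval (0.15, 0.20), in line with the conclusion reached in the text.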

It is also necessary to impose constraints on the terms that can occupy
the places of y and z in the definition. All of these matters are somewhat
difficult, but they do not bear directly on the issues at hand. What I am
supposing, and claim to have shown elsewhere, is that it is possible to
give rules that pick out a reference class for a sentence P, given a set
of sentences representing a rational corpus.

The notion of probability that I have sketched has the following
properties:

(P-1) If S is a rational corpus, and P and Q are known in S to have the
same truth value (i.e., if "P ≡ Q" ∈ S), then Prob(P,S) = Prob(Q,S).

In the modified example, since we still know in S that John will go to the
movies if and only if he draws a black chip, we have:

    Prob("John will go to the movies",S) =
    Prob("John will draw a black chip",S) = [.15, .20].

(P-2) If S is a rational corpus and P ∈ S, then Prob(P,S) = [1,1] and
Prob(~P,S) = [0,0].

In the modified example, since "ru ⊂ u" is in S, Prob("ru ⊂ u",S) = [1,1].
Since "c is round" is in S, Prob("c is not round",S) = [0,0].

(P-3) Every probability is based on a relative frequency known in S (or
measure, or chance, or limiting frequency).

In the initial version of the example, the probability that John will go to
the movies is based on the knowledge that %(u,b) ∈ [.6, .7]; in the modified
example it is based on the knowledge that %(ru,b) ∈ [.15, .20].

(P-4) If S consists of a set of true sentences of the form "%(y,z) ∈ [p,q]"
and a sentence "x ∈ y," the definition of probability is
equivalent to a frequency definition correspondingly restricted in
scope.

Suppose that "all we know" about c is that it is a chip in the urn, but we
know of the set u of chips in the urn that there are four kinds, k1, k2,
k3, k4, and that the relative frequencies of the kinds are 1/8, 1/4, 1/16,
and 9/16 = 1 - 1/8 - 1/4 - 1/16. The probability that c belongs to any kind
(or any Boolean combination of kinds) is the corresponding frequency.
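The arithmetic for Boolean combinations of kinds can be checked with exact fractions. This is a sketch under the assumption, implied by the frequencies summing to 1, that the four kinds are mutually exclusive and jointly exhaustive; the names `kinds`, `prob_any_of`, and `prob_none_of` are illustrative.

```python
from fractions import Fraction

# Known relative frequencies of the four kinds of chip in the urn u.
kinds = {
    "k1": Fraction(1, 8),
    "k2": Fraction(1, 4),
    "k3": Fraction(1, 16),
    "k4": Fraction(9, 16),
}


def prob_any_of(selected):
    """Probability that the chip belongs to at least one of the selected kinds.

    Assumes the kinds are disjoint, so the frequencies simply add.
    """
    return sum((kinds[k] for k in set(selected)), Fraction(0))


def prob_none_of(selected):
    """Probability that the chip belongs to none of the selected kinds."""
    return 1 - prob_any_of(selected)


total = prob_any_of(kinds)                 # all four kinds together
k1_or_k2 = prob_any_of(["k1", "k2"])       # 1/8 + 1/4
not_k4 = prob_none_of(["k4"])              # 1 - 9/16
```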

(P-5) If S is a rational corpus and T a finite set of sentences, there
exists a function B (a belief function) such that B satisfies the
axioms of the conventional probability calculus, and for every
sentence P in T, B(P) ∈ Prob(P,S).

Suppose that T = {t1, t2, t3} and Prob(t1,S) = [p1,q1], Prob(t2,S) = [p2,q2],
Prob(t3,S) = [p3,q3]. Then there exists an additive real-valued
function B such that B(t1) ∈ [p1,q1], B(t2) ∈ [p2,q2], and B(t3) ∈ [p3,q3].

(P-6) If S is a rational corpus, and P is derivable from Q, then the
lower probability of P is at least as great as the lower probability
of Q.

In the initial example, the probability that John will go out somewhere
tonight is an interval whose lower bound is at least .6.

(P-7) The principle of epistemic conditionalization (that if P is added
to S to yield a new rational corpus S', then the probability of Q
relative to S' should be the ratio of the probability of "P and Q"
to the probability of P, where "ratio" is suitably understood) does
not generally hold.²

(P-8) If the probability of P relative to S is [p,q], the probability of
~P relative to S is [1-q,1-p].

In our original example, the probability that John will go to the movies
is [.6, .7]. The probability that he will not go to the movies is [.3, .4] =
[1-.7, 1-.6].
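Property (P-8) is simple enough to state as a one-line computation. The sketch below uses exact fractions to avoid floating-point rounding; the function name `complement` is an illustrative choice.

```python
from fractions import Fraction


def complement(interval):
    """Interval probability of ~P, given the interval probability [p, q] of P."""
    p, q = interval
    if not (0 <= p <= q <= 1):
        raise ValueError("expected 0 <= p <= q <= 1")
    # The upper bound for P fixes the lower bound for ~P, and vice versa.
    return (1 - q, 1 - p)


movies = (Fraction(6, 10), Fraction(7, 10))   # John goes to the movies: [.6, .7]
no_movies = complement(movies)                # John does not go: [.3, .4]
```

Applying `complement` twice returns the original interval, as one would expect of negation.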

The particular framework that I discuss here is not the only one of its
kind. There are other, similar, ways of approaching these problems:
for example, Levi (1980) develops a view of rationality that is epistemic
in character, and depends even more on the notion of commitment, but which
is distinguished from the present account both by the importance of chance
in his treatment of epistemic probability, and by a more thorough pragmatic
orientation. On Levi's view the principle of epistemic or confirmational
conditionalization does hold.

4. The Principles of Full Belief or Acceptance.

What is it to accept a sentence, or to award full belief to it? We
might construe belief in this sense as occurrent belief: one is fully
believing or accepting P only if one is thinking of it, and thinking of it
in a certain way. But this would not lead us to a very interesting normative
theory. We might construe belief dispositionally: X believes the sentence
P if X is inclined to assent to it, if asked, or if X has a
(non-probabilistic) disposition to act as if it were true. But again this seems
not strong enough to give rise to an interesting normative account of
rational belief.³ I think of myself as accepting the axioms of set theory,
for example, but there are many theorems of arithmetic that I am too
ignorant to assent to. On the other hand, if I were offered a proof of
such a theorem, I would not only be inclined to assent to it following
the proof, but feel that I had been committed to the acceptance of that
theorem all along. When I accept the axioms of set theory, I am committing
myself to accepting their consequences. Of course, if those axioms have
untoward consequences (for example, if they should prove to be inconsistent),
I will not then regard myself as committed to believing all the sentences of
the language. Rather, were I to become aware that the axioms of set theory
commit me to everything, I would no longer accept those axioms, but would
rather seek out some fixed-up set of axioms.

I shall construe full belief or acceptance as commitment. To accept
P is to be committed to P, and also to its deductive consequences. But we
may still ask how far this commitment takes us: does it commit us to the
consequences of the whole set of sentences we accept? Levi (1980) argues
that it does. I would argue against this, not on grounds of human
finitude or logical limitation (that would speak against being committed
to all the theorems of set theory) but on the grounds that it does not
seem plausible to demand that the set of empirical sentences we accept be
deductively closed. Here are four examples:

(1) I can believe of each statement that I accept in a certain context
that it is true (or else I wouldn't accept it) and also reasonably believe
that (some) one of them is false: I can believe the negation of the
conjunction of those statements.⁴ The situation can be represented as


    B(P1)
    ...
    B(Pn)
    B(~(P1 ∧ ... ∧ Pn))

(2) The lottery problem is well known. If reasonable acceptance is to be
grounded in probability alone, then one should believe of any specified
ticket in a fair million ticket lottery that it will not win the grand prize.
But the conjunction of a million such statements contradicts the assertion
that the lottery is fair. Our epistemic state is highly symmetrical with
respect to the tickets; the evidence that any given ticket will lose is
overwhelming; the evidence that not all the tickets will lose is even more
overwhelming. To accept some but not all the statements of the form
"ticket i will not win" is grotesquely arbitrary. To reject them all, on
the grounds of the very symmetry that leads to the problem, is to give up
too much; the same grounds would undermine the arguments used to justify
"the rejection of the null hypothesis" in statistics. Each possible sample,
in such an argument, is assumed to be "equally probable," and the null
hypothesis is rejected exactly on the grounds that if the hypothesis were
true, we would have to suppose that we had drawn the winning ticket, which
is too improbable to be believed.
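The numbers behind the lottery problem can be made explicit. The following sketch assumes, for illustration, that a fair lottery has exactly one winning ticket and that acceptance is governed by a simple probability threshold; the function name and the threshold of .99 are assumptions, not part of the text.

```python
from fractions import Fraction


def lottery_check(n_tickets, threshold):
    """Compare each 'ticket i will not win' with their conjunction.

    Assumes exactly one of the n_tickets wins and each is equally likely.
    Returns (probability a given ticket loses,
             whether that statement clears the threshold,
             whether 'every ticket loses' clears the threshold).
    """
    p_lose = 1 - Fraction(1, n_tickets)
    each_acceptable = p_lose > threshold
    # With exactly one winner, 'every ticket loses' has probability zero.
    p_all_lose = Fraction(0)
    conjunction_acceptable = p_all_lose > threshold
    return p_lose, each_acceptable, conjunction_acceptable


result = lottery_check(1_000_000, Fraction(99, 100))
```

Each of the million statements clears the threshold while their conjunction does not, which is why a purely probabilistic rule of acceptance cannot also demand full deductive closure.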

(3) In statistical inference, if we allow ourselves to speak of the
probability of statistical hypotheses at all, there are cases in which many
hypotheses have the same high probability, but in which their conjunction
is inconsistent. For example, consider the inference from a sample of a
normal population of unknown mean to a value for that mean. Given the
sample mean, the probability that the unknown mean lies in any interval is
given by a "fiducial" distribution, integrated over that interval. For
any number 1 - ε less than 1, there are an infinite number of intervals such
that the fiducial probability is 1 - ε that the unknown mean belongs to that
interval. But the intersection of these intervals is very small. We can
also have a fiducial probability of 1 - ε that the mean does not belong
to this intersection.

(4) In measurement theory we often suppose ourselves to be making a large
number of direct comparisons of objects. Consider a set of n statements of
the form "a_i is the same length as a_(i+1)". (The first board for our
bookshelves is the same length as the second; the second is the same length as
the third; ...) Clearly each of these statements may be acceptable when
the statement "a_1 is longer than a_(n+1)" is also acceptable. (The first board
is definitely longer than the last.) There are no grounds for choosing among
them. But to reject them all is to give up on measurement. One might say
that we should reject them all; that the lesson to be learned is that we
cannot judge "equality of length" directly. But approximate equality of
length won't give us measurement theory (for example, it isn't transitive).

How do sentences become accepted? One answer is: when they are
probable enough. Suppose that a sentence P is so highly probable as to be
practically certain, relative to a set of sentences S construed as a
rational corpus. Do we want to regard P therefore as a member of S? It
is awkward to do so, for then P is not probable or practically certain,
relative to S, but (by P-2) certain, relative to S. Furthermore, we
must face the problem of saying what degree of probability is required
for acceptance in S. One way of approaching these problems is to
distinguish two levels of rational corpora. Let S_e be the evidential
corpus, and S_p be the corpus of practical certainties. We then adopt the
following principle of rational acceptance:

Principle I: A sentence P is acceptable in the corpus of practical
certainties S_p, indexed by the real number p, if and only if there is
an evidential corpus S_e, indexed by a number e larger than p, such
that the minimum probability of P relative to S_e is greater than p.

In virtue of P-6, this principle yields just the structure we have been
talking about.

For example, in the affairs of ordinary life, we might take e = 0.999
and p = 0.99. We can take it as evidence that a given coin is fair: that
is, we can include a statement in our corpus of evidential certainties
S_.999 that the distribution of heads in tosses of this coin is approximately
binomially distributed with a parameter close to a half, say .500 ± .010.
Relative to S_.999, then, the probability of heads on a single toss is
[.490, .510]. But if we consider a sequence of seven tosses, the maximum
probability of getting all heads is less than .01; the minimum probability
of not getting seven heads in seven tosses is at least 0.99; and therefore
we may regard it as practically certain that we will not get seven heads in
the next seven tosses. "The next seven tosses will not all result in heads"
will appear in S_.99, our corpus of practical certainties.
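The seven-toss figures can be verified directly. This sketch assumes the tosses are independent with a common per-toss probability of heads lying somewhere in [.490, .510], so that the bounds on "all heads" come from the endpoints; the function names are illustrative.

```python
def all_heads_interval(p_low, p_high, n):
    """Bounds on the probability of n heads in n independent tosses.

    p ** n increases with p, so the endpoints of the per-toss interval
    give the endpoints of the n-toss interval.
    """
    return (p_low ** n, p_high ** n)


def acceptable(min_probability, p):
    """Probabilistic test of Principle I: minimum probability must exceed p."""
    return min_probability > p


low7, high7 = all_heads_interval(0.49, 0.51, 7)
not_all_heads7 = (1 - high7, 1 - low7)      # by the complement rule (P-8)

low6, high6 = all_heads_interval(0.49, 0.51, 6)
not_all_heads6 = (1 - high6, 1 - low6)

seven_ok = acceptable(not_all_heads7[0], 0.99)
six_ok = acceptable(not_all_heads6[0], 0.99)
```

With these figures seven tosses is the shortest run for which "not all heads" clears the .99 level; for six tosses the maximum probability of all heads is still above .01.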

How about S_e, the evidential corpus? If what is in S_e is at issue,
we construe the level of e as practical certainty, and require that there
be a proto-evidential corpus, S_e', relative to which each ingredient of
S_e has a high enough probability. In other words, we would simply apply the
same principle at a higher level.

In the previous example we may ask for the grounds on which we can
accept in S_.999 the assertion that the distribution of heads is binomial
with a parameter of .500 ± .010. To answer we would take p = .999 and
e = .9999 (say). We include in S_e our knowledge of the dynamics of coin
tossing and our knowledge of the design of coins and the procedures for
minting them. The probability that for this coin the distribution of heads
is approximately binomial with parameter .500 ± .010 is at least .999
relative to S_e.

The value of p, the level of probability required for acceptance into
S_p, concerns us both as a parameter in the normative theory of rational
belief and as a parameter in the corresponding descriptive theory.

Dretske (1982), among others, has argued that "there seems to be no
non-arbitrary place to put a threshold" (p. 8). There are two answers to
this claim.

It seems inevitable that what will strike us as an appropriate level
of p for one context will be inappropriate in another. For the normative
theory, therefore, we need a way of classifying contexts. My suggestion
is this: that we consider relatively global contexts, and characterize
them in terms of the maximum or minimum odds at which our chancy decisions
in those contexts might pay off. Thus in "ordinary life" we do not
ordinarily consider gambles involving odds of greater than 20:1. This
would suggest p = 0.95 as a suitable acceptance level. There are special
contexts that we all face on occasion when this does not seem appropriate:
when buying insurance, for example. In that case we are contemplating a
gamble in which the odds may be 100:1 or 1000:1 or greater. An appropriate
level of practical certainty might then be .99 or .999. In scientific
inquiry or in public policy, the stakes may be similarly extreme, and the
level of acceptance may be similarly stringent.
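One way to read the link between odds and acceptance levels is the usual conversion of odds of n:1 into the probability n/(n+1). The text does not state a formula, so this conversion is an assumption; under it the quoted levels come out as rounded values.

```python
from fractions import Fraction


def threshold_from_odds(n):
    """Acceptance level suggested by maximum odds of n:1.

    Assumes the standard conversion: odds of n:1 correspond to
    a probability of n / (n + 1).
    """
    if n <= 0:
        raise ValueError("odds must be positive")
    return Fraction(n, n + 1)


ordinary_life = threshold_from_odds(20)      # 20/21, about .95
insurance = threshold_from_odds(100)         # 100/101, about .99
high_stakes = threshold_from_odds(1000)      # 1000/1001, about .999
even_odds = threshold_from_odds(1)           # exactly 1/2
```

On this reading odds of 19:1 would give .95 exactly, and odds of 20:1 give a value that rounds to it.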

At the other extreme, some conservative academic epistemologists seem
to feel that we can (or should) avoid chancy decisions. Since the
maximum payoff you can offer is the reciprocal of the maximum payoff you can
receive, this suggests that the range of odds contemplated in the epistemic
context is close to 1:1; and this would lead to a value of p very close to
(just over) 1/2. ("You can reasonably believe P if it is more probable than
not.")

This is a relatively new question, and so far as I know relatively
little thought has been devoted to it. But, as far as the normative theory
is concerned, it certainly seems premature to claim that there is no
non-arbitrary threshold.

With respect to the descriptive theory, we may be better off. The
parameter p is an adjustable one whose value can be chosen to make the
most sense of the data we have. Some indication that this might be a
useful way to approach the data of behavioral decision theory will be given
in a subsequent section.

There are a number of consequences of Principle I that are worth
remarking.

(1) If a sentence P has a minimum probability greater than p relative to
S_e (or a maximum probability less than 1-p), it is not rational to bet
against it (on it) at any odds in the range characterizing practical
certainty. It is a practical certainty, a datum. But as the above
discussion of the value of p suggests, if one is actually offered
enormous odds, that in itself may suffice to change the context to one
in which a different (larger) value of p is appropriate. In an ordinary
context, I simply accept the statement that about half the tosses of
this coin will land heads. But if the relative pay-off were great
enough, I would consider a bet on the question of whether or not this
coin is biased, i.e., on the truth of a statement that in another
context I accepted as a datum.

(2) Approximate statistical statements of the form "%(A,B) ∈ [p,q]" can
be rendered probable enough for acceptance by observational evidence
in S_e.⁵ In its crudest form this inference makes use of (1) the
set-theoretical statistical truth that whatever be the proportion of A's
that are B's, the proportion of large subsets of A that have a frequency
of B's close to that among A's in general is very large; (2) the sample
of A's that we have observed is a random member of the set of those
large subsets relative to what we know; and (3) the proportion in
the sample, known to have 100r% B's, differs by at most δ from the
proportion among A's in general if and only if the general proportion
lies in the interval r ± δ. Thus, (1) whatever be the proportion π of black
balls among draws with replacement from urn u, the proportion of sets
of 10,000 draws that have a relative frequency of black balls differing
by less than .02 from π is at least .98; this sample of 10,000 has
40% black balls; this sample is a random member of the set of samples
of 10,000; therefore the probability is [.98, 1.0] that
π ∈ [.40-.02, .40+.02] = [.38, .42]. Note, however, that precise
statistical statements of the form "%(A,B) ∈ [p,p]" (and in particular
statements of the form "%(A,B) ∈ [1,1]", corresponding to "All A's are
B's") cannot in general be rendered highly probable by observational
evidence.
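The claim that at least .98 of the samples of 10,000 draws are within .02 of the true proportion can be checked against a standard distribution-free bound. The text does not say which bound is intended; the sketch below uses Hoeffding's inequality for independent draws as one illustrative choice, and the function names are assumptions.

```python
import math


def hoeffding_coverage(n, eps):
    """Lower bound on the proportion of n-draw samples whose relative
    frequency lies within eps of the true proportion, whatever that
    proportion is (Hoeffding's inequality, independent draws)."""
    return max(0.0, 1.0 - 2.0 * math.exp(-2.0 * n * eps ** 2))


def interval_for_proportion(sample_freq, eps):
    """Interval r ± eps for the general proportion, clipped to [0, 1]."""
    return (max(0.0, sample_freq - eps), min(1.0, sample_freq + eps))


coverage = hoeffding_coverage(10_000, 0.02)          # 1 - 2*exp(-8)
black_interval = interval_for_proportion(0.40, 0.02)
```

The bound comes out well above .98, so the figure used in the text is a conservative one for this sample size and tolerance.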

(3) Statements that are characteristic of the language (i.e., statements
that are tautologies in that language) automatically receive
probability 1 and are automatically accepted in S_e. Furthermore,
given any such statement T, and any statement P probable enough to be
accepted in S_p, their conjunction will be probable enough to be
accepted in S_p, and therefore (by P-6) so will any deductive consequence
of their conjunction.

(4) From (2) and (3) it follows that any universal generalizations
("All A's are B's", for example) that are in the corpus are to be construed
as tautologies of the language L, or else are to be construed as
approximate statistical generalizations: "Almost all A's are B's".
(It is an a priori constraint on our language that all bears are mammals;
it is an arbitrary rule of post office procedure that all sealed
letters must have 6d stamps; but it is only "approximately true" that
all people who go to three drugstores in a row have failed to find
what they wanted in the first two. It is more accurate to say that
it is true that almost all people ...) As will become clear later,
this is an important distinction, since "All A's are B's" entails its
contrapositive, "All non-B's are non-A's", while the approximate
statement, "Almost all A's are B's", does not entail "Almost all
non-B's are non-A's."

(5) It is possible to argue that since much of scientific theorizing is
concerned with the establishment of truly universal generalizations,
it is better to look on it in terms of the choice between languages
characterized by different meaning postulates or tautologies, than as
a matter of testing or attempting to falsify universal generalizations.
If this is so, it may explain some of the difficulties surrounding
attempts to explore or inculcate a Popperian approach to scientific
inference.

(6) A deductive argument will show that the corpus S_p is committed to the
statement P only when the conjunction of the premises of the deduction
is in S_p. We require "P1 ∧ ... ∧ Pn" ∈ S_p and {P1, ..., Pn} ⊢ P, and
not merely that each of P1, ..., Pn is in S_p and {P1, ..., Pn} ⊢ P.

A classic difficulty in the theory of knowledge has stemmed from the
urge to regard observation (or perception) as an incorrigible foundation
of knowledge, and at the same time to recognize that observation,
particularly in science and particularly in the form of measurement, must
be regarded as fallible. It is clear that we must admit the results of
observation into our evidential corpus if we are to learn from experience
at all. Yet it is also clear that however confident we may be of an
observation, there is a possibility (which often cannot simply be "ignored")
that it is in error.

The framework already proposed suggests a resolution of this difficulty.
In learning a language, whether ordinary language at the age of two, or
the specialized language of livestock judging at the age of thirty-two,
one is learning to make observational judgments (among other things).
One also learns that one's judgments are fallible: one never achieves
perfection. Errors may be pointed out by one's teachers, or they may
become obvious through conflicts with other things one knows, including
generalizations characteristic of the language. In either event,
one can learn from one's mistakes: one can learn (in a metalinguistic
rational corpus) that observational judgments of kind K are wrong some
small portion of the time: say about ε.

Consider a sentence P reflecting an observational judgment of type K.
If P is, relative to what one knows about it, a random member of K, the
probability that it is mistaken is about ε. If ε is less than 1 - p (that
is, if 1 - ε is greater than p), P may be accepted as practically certain,
as a member of S_p. Note that P may be a random member of K at one time,
and cease to be a random member of K when new information relevant to the
chance that P is mistaken becomes available.

In fact, since, by P-8, P(S,K) = [p,q] if and only if P(~S,K) =
[1-q,1-p], it is perfectly possible to have

    P(S,K) = [p,q] : S is practically certain relative to K

    P(~S,K∪K') = [1-q',1-p'] : ~S is practically certain relative
                               to an expansion of K by K'.

Even observational statements may come to be accepted and then come to be
rejected in the light of new information on the model suggested. For
example, one may accept on the basis of observation that there is a cat
on the porch, and then, on closer investigation, which results in the addition
of new evidential statements to one's rational corpus, be led to reject
that statement and to accept the statement that it is not a cat (but, say,
a mongoose) on the porch. That does not mean that one was not justified
in first accepting that it was a cat. A more common example is to accept
on the basis of measurement that the melting point of compound X is
k ± d degrees, and then, on the basis of more, and more careful, measurements,
to reject that first measurement as yielding an "outlier".

The principle that justifies this is essentially just a metalinguistic
version of Principle I. It is phrased in terms of error, and it refers to
the evidential corpus S_e rather than S_p, for convenience. Recall that both
e and p are numbers adjustable to fit the context.

A sentence P is acceptable in the corpus of evidential certainties
indexed by the real number e on the basis of observation if and
only if the probability that P is mistaken, relative to the
meta-evidential corpus MS_e, is less than 1 - e.

The meta-evidential corpus embodies our knowledge about the frequency of
mistakes among statements of various classes based on observation.

Thus Principle I provides for the acceptance of fallible observation
statements. Technical object-language/metalanguage complications arise in
the general application of Principle I to both observational and directly
statistical uncertainties. These complications have been discussed in some
detail in (Kyburg, 198312); they have only a marginal bearing on the issues
with which we are concerned here, and will not be further discussed.

Given the syntactical notions of provability and probability, then,
and given a formal language L in which the beliefs of an individual can
be expressed, we take Principle I to say all there is to say about full
belief.

5.  The  Principles  of  Partial  Belief. 

The subjective interpretation of probability generally supposes that
a degree of belief can be represented by a real number in the interval
[0,1]. Various coercive procedures (e.g., forced bets)
have been suggested as a way of determining the degrees of belief of
experimental subjects. At the same time, some writers (Savage (1966),
Jeffrey (1974), Good (1962)) have suggested that beliefs be characterized
on more than one dimension: one might feel "less secure" about one's
degree of belief of .3149725 that the next president will be a Republican,
than about one's degree of belief of .3149725 that of the next 75 flips
of this coin, between 39 and 35 will result in heads.

It has also been suggested (Smith (1961), Dempster (1968), Kyburg (1961),
Good (1962), Shafer (1976), Levi (1974)) that one way to capture this
dimension is through the use of intervals to represent degrees of belief.

Smith (1961) suggests that a first approach to a behavioral counterpart
of an interval of belief would be given by the least odds at which a
person would be willing to bet on P, and the least odds at which he would
be willing to bet against P. In general one might suppose that these
odds are not reciprocals, and that they would define an interval. This
approach is rather crude, and is excessively sensitive to the person's
interest in or distaste for gambling. A more accurate way of measuring
intervals of belief would be of considerable interest.

Even without a way of getting at interval beliefs directly, however,
we can apply the present theory. If, relative to an agent's body of
practical certainties, the probability of the statement P is the interval
[p,q], then it is clearly irrational of him to be willing to pay more than
q dollars for a ticket that yields a dollar if P is true and nothing
otherwise. Similarly, it would be irrational of him to sell a ticket
for less than p dollars that obligated him to pay a dollar if P is true.
Thus we can use many of the standard ways of getting at alleged real
number degrees of belief to apply the present theory. We can put the
matter more generally by saying that a person's behavior with respect
to a sentence P should be compatible with his having a (hypothetical)
degree of belief in P that falls within the interval representing the
probability of P.
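The two betting constraints just stated are easy to express as checks on a proposed price for a one-dollar ticket on P. The sketch below is illustrative only; the function names and the sample prices are assumptions, and the interval [.6, .7] is borrowed from the movie example.

```python
def buying_is_irrational(price, interval):
    """Paying more than the upper probability for a $1 ticket on P."""
    low, high = interval
    return price > high


def selling_is_irrational(price, interval):
    """Selling, for less than the lower probability, a ticket that
    obligates one to pay $1 if P is true."""
    low, high = interval
    return price < low


def compatible_degree_of_belief(belief, interval):
    """A real-valued degree of belief is compatible with the theory
    when it falls inside the probability interval."""
    low, high = interval
    return low <= belief <= high


movies = (0.6, 0.7)
overpaying = buying_is_irrational(0.75, movies)
fair_purchase = buying_is_irrational(0.65, movies)
underselling = selling_is_irrational(0.55, movies)
belief_ok = compatible_degree_of_belief(0.62, movies)
```

Prices between the lower and upper probability are not ruled out by either check, which is the sense in which the interval constrains behavior without fixing it.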

We  state  this  as  Principle  II: 

Principle II: If S_p is X's corpus of practical certainties in a certain
context, then X's degree of belief in a sentence P, whether it is
construed as an interval of the form [r,r], or as a non-degenerate
interval, should be included in Prob(P,S_p) = [p,q].

To borrow from an earlier example, if, relative to my body of evidential
certainties, the probability that John will go to the movies tonight
is [.6, .7], then my "degree of belief" that John will go to the movies
tonight should be some number (e.g., 0.62) in that interval, if we
construe degrees of belief as measured by real numbers, or should be some
subinterval of [.6, .7] (e.g., [.63, .65]) if we take "degrees of belief"
to be better represented by intervals.

Finally, we need a principle connecting degrees of belief and utilities
with actions. We adopt the principle of maximizing expected utility as
our third principle of rationality. We assume that X has a real-valued
utility function defined over statements. (This is a questionable supposition
but one we shall not question here.) The outcomes of decisions or choices
can be described as the coming true of certain propositions. Of course
it is the utility of the total world outcome that concerns the actor,
but in common with most writers, I shall assume that in artificially
simplified situations we can often work with the marginal utilities
associated with sentences describing partial outcomes: winning a dollar
if the coin lands heads, and losing a dollar if the coin lands tails. This
simplification is defensible on the view that we are considering, since
even though it is possible that I should lose my entire fortune between now
and the tossing, so that my last dollar would be very valuable, and even
though it is possible that my competitor should renege on his bet, it
is under ordinary circumstances practically certain that neither of these
possibilities will be realized.

Since probabilities are intervals, even if we assume that utilities
are real valued, expected utilities must also be construed as intervals:

The expected utility of P's being true is the interval comprised by the
utility of P's being true multiplied by the lower probability of P, and
the utility of P's being true multiplied by the upper probability of P.

If, relative to my corpus of practical certainties, the probability of P
is [q,r], and my utility for P is V, the expected utility of P is [Vq,Vr].
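
As a rough illustration of this definition, the interval computation can be sketched in Python. The function name and the sample numbers are illustrative assumptions, not part of the theory; the reordering of endpoints when the utility is negative is likewise an added assumption, made only so that the result is always a well-formed interval.

```python
def expected_utility_interval(utility, lower, upper):
    """Interval expected utility for a probability interval [lower, upper].

    Follows the definition in the text: multiply the utility by the lower
    and by the upper probability.
    """
    if not 0.0 <= lower <= upper <= 1.0:
        raise ValueError("need 0 <= lower <= upper <= 1")
    a, b = utility * lower, utility * upper
    # Assumption (not in the text): reorder the endpoints so that a
    # negative utility still yields an interval with low <= high.
    return (min(a, b), max(a, b))


# Illustrative values: utility 10, probability interval [.1, .2].
eu_gain = expected_utility_interval(10, 0.1, 0.2)   # about (1.0, 2.0)
# Illustrative values: utility -1, probability interval [.8, .9].
eu_loss = expected_utility_interval(-1, 0.8, 0.9)   # about (-0.9, -0.8)
```

The result is a pair rather than a single number, which is why such a quantity cannot simply be "maximized."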

Of course one cannot simply "maximize" an interval, so the principle
of maximizing expected utility cannot be phrased in the usual way. But
one thing does seem clear, and that is that it is irrational to choose
an action whose maximum expected utility is less than the minimum
expected utility of some other action. This limits our choices, but in
general need not pick out a unique one as "rational." It might be that
some further constraint could be imposed (for example: choose the action
whose minimum expected utility is a maximum) but this seems to come into
competition with contrary constraints (for example, choose the action
whose maximum expected utility is a maximum). It thus seems sensible,
at this point, to limit ourselves to the following principle of rational
action:

Principle III: In a situation in which X's corpus of practical certainties
is S, X ought rationally to reject any choice C_i for which there is
a C_j whose minimum expected utility exceeds the maximum expected
utility of C_i.

For example, suppose that choice C_1 yields A with probability [.1,.2]
and not-A with probability [.8,.9] and that the utility of A is 10,
and of not-A is -1. The expected utility of C_1 is [1,2] + [-.8,-.9] =
[.2,1.1]. Similarly, suppose that C_2 has an expected utility of [1.5,2.3]
and that C_3 has an expected utility of [0.8,5.0]. The rule then requires
rejection of C_1 but does not legislate between C_2 and C_3.
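
The rejection rule of Principle III can be sketched as a short Python function. This is only a minimal sketch: the function name is hypothetical, and the labels C1, C2, C3 are placeholders for the three choices of the example, whose interval expected utilities are taken as given.

```python
def admissible_choices(intervals):
    """Return the choices that Principle III does not reject.

    `intervals` maps a choice label to its (min, max) expected utility.
    A choice is rejected when some other choice has a minimum expected
    utility that exceeds its maximum expected utility.
    """
    rejected = set()
    for name, (_, high) in intervals.items():
        for other, (other_low, _) in intervals.items():
            if other != name and other_low > high:
                rejected.add(name)
                break
    return [name for name in intervals if name not in rejected]


# Interval expected utilities from the example (labels are placeholders).
choices = {"C1": (0.2, 1.1), "C2": (1.5, 2.3), "C3": (0.8, 5.0)}
remaining = admissible_choices(choices)  # C1 is rejected; C2 and C3 remain
```

The rule deliberately returns a set of admissible choices rather than a single best one; it says nothing about how to choose between overlapping intervals.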
6. Deductive Performance.

There are a number of studies that purport to show that people are
deficient in their deductive performance or competence or both. Of
course, few people are brilliant logicians, and even brilliant logicians
cannot be faulted for not living up to all of their deductive commitments.
Nobody is aware of all the theorems of set theory, though many people are
committed to them. The studies of deductive performance therefore ordinarily
concern the failure of subjects to make relatively simple deductive
inferences. One quite robust deficiency in this regard, apparent in both
simple and complex tasks, has been called "confirmation bias": the
tendency of people to look for or take account of evidence confirming a
hypothesis rather than evidence falsifying a hypothesis.

One group of experiments in which subjects were to formulate hypotheses,
test them, modify them, and so on, under circumstances designed to
simulate those in which real scientists work was reported by Mynatt et al.
(1977) and (1978). In both series of experiments, the investigators
discovered a "confirmational bias." The subjects were tempted (1977, p. 89)
to formulate an incorrect hypothesis to account for the initial
data with which they were presented. They were subsequently given a choice
between pairs of experimental setups, some of which could falsify that
hypothesis, others of which could provide disconfirming evidence for it,
or suggest alternative hypotheses, and others of which could only provide
confirming data. Many of the subjects chose to examine setups that
could only provide confirming evidence for their initial hypothesis, even
when they had received instructions to the effect that it was the job of
the scientist to "disconfirm theories and hypotheses." "Subjects who
started with triangle hypotheses, regardless of [whether they were told
to confirm or to disconfirm hypotheses] chose at a much higher than chance
rate screens [presenting evidence] which could only confirm such hypotheses."
(p. 93) On the other hand, "subjects could use falsifying data when they
got it." (p. 94)

This suggests that the subjects were not (for the task at hand)
deficient in deductive prowess, but deficient in strategic awareness when
it came to testing universal generalizations. The second, more complex,
series of experiments confirmed the results of the first, "even though
the instructional manipulations were much more extensive than in the
earlier study." (1978, p. 404) On the other hand, in this more complex
task subjects did not almost always abandon disconfirmed hypotheses.
The authors do not suggest deductive incompetence as the reason, but
write: "the more complex environment may have made generation of new
hypotheses, and hence, abandonment of disconfirming hypotheses, more
difficult." In the second series, the subjects were explicitly told that
the phenomena obeyed a uniform set of deterministic laws, but presumably
that information was also implicit in the instructions to the first set of
subjects.

It is quite clear that the authors began (and perhaps ended) with
the conviction that some form of Popperian (Popper, 1959) approach to
hypothesis testing was appropriate (Mynatt et al. (1977), p. 85; Mynatt
et al. (1978), p. 396). By the end of the second series, apparently some
doubts had been raised (Mynatt et al. (1978), p. 405). But even then, they
have not abandoned the phrase "confirmation bias," even though they suggest
that the "confirmation bias may not be completely counterproductive" (p. 405).

On the view sketched earlier, the confirmation "bias" is generally
perfectly appropriate. In the "natural ecology" (Einhorn and Hogarth (1981))
few universal generalizations present themselves as live possibilities;
indeed, on the view suggested, universal generalizations can be maintained
only by being made features of the scientific language. The basic inductions
which can serve as a guide in life are statistical. It is obviously
utterly irrelevant to look at non-B's for evidence concerning "%(A,B) = p."
This is so however close p may be to 1. One does not look among non-B's for
evidence either supporting or controverting the statement "Almost all A's
are B's." Only A's are relevant.

At a more sophisticated level of science (note that this did not
become a significant part of empirical knowledge until the last few
hundred years) one does encounter genuine universal generalizations.
But even then, as Kuhn (1962) and Feyerabend (1970) have pointed out, such
generalizations are, particularly early on, maintained in the face of
falsifying evidence. That is, they are maintained "come what may"; the
characteristic, as Quine (1961) has suggested, of linguistic conventions.

Every  student  knows,  of  course,  that  there  are  universal  generalizations 
in  real  science.  Most  students  are  told  (alas)  that  they  are  empirical 
and  falsifiable.  But  every  student  has  also  experienced  the  apparent 
falsification  of  an  accepted  universal  law  —  if  only  Archimedes'  law  in 
a  physics  laboratory  —  only  to  be  told  that  the  reason  for  the  apparent 
falsification  is  that  he  has  "not  done  the  experiment  correctly"  or  "not 
interpreted the results correctly."

The students who were the subjects of the experiments of Mynatt et al.
were therefore in a bind: they were torn between the natural rational attempt
to confirm a hypothesis of the form "Almost all A's are B's," and the
knowledge that there is a universal relation to be found, while at the
same time they knew that universal relations in general survive apparent
disconfirming instances. It is no wonder that they reported being frustrated
(Mynatt et al. (1978), p. 404). Mynatt et al. also report ((1978), p. 405)
that "the three subjects who quickly abandoned disconfirmed hypotheses
did not rapidly progress."

If clever students are given a list of alternative hypotheses, one
of which is known to be true, one would conjecture that with relative
efficiency they would proceed to falsify hypotheses until there was but
one left. If they are given the information that there is a useful pattern
for prediction to be discovered, one would conjecture that they would
(correctly, rationally) seek to confirm a statistical hypothesis of the
form "Almost all A's are B's" or "Almost none of the A's are B's" by
examining the A's. If they are told that there exists a universal
hypothesis, but need to make up their own list, one would conjecture that
they would oscillate between the two approaches: looking for useful clues
to support a hypothesis of the form "Almost all A's are B's," and supporting
it by looking for more A's; considering a finite and nonexhaustive list of
hypotheses of the form "All A's are B's" and perhaps rejecting items in the
list in the face of falsifying evidence, or perhaps modifying them. One
would expect just such a confused picture as that revealed by the protocols
of the experiments cited.

Closely related to the data just reviewed are the data provided by
Wason (1960) and Johnson-Laird and Wason (1977). In these simpler
experiments the subjects are presented with four cards showing A, D, 4, and 7.
They are presented with the statement: If a card has a vowel on one side,
it has an even number on the other side. Their task is to determine which
cards must be turned over in order to know whether the statement is true or
false. The subjects do not do well; many say that the cards marked
A and 4 should be turned over; some just mention card A. Few give the
correct answer, which is A and 7.
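
The logic of the correct answer can be checked mechanically. The following Python sketch assumes, as the task is ordinarily set up, that every card has a letter on one side and a number on the other; the lists of possible hidden faces and the function names are hypothetical, chosen only to include one vowel, one consonant, one even and one odd number.

```python
# Hypothetical possible hidden faces: one vowel, one consonant,
# one even number, one odd number.
HIDDEN_LETTERS = ["A", "D"]
HIDDEN_NUMBERS = ["4", "7"]


def violates(letter, number):
    """True if a card with these two faces falsifies
    'if a vowel on one side, then an even number on the other'."""
    return letter in "AEIOU" and int(number) % 2 != 0


def must_turn(visible):
    """A card must be turned over iff some possible hidden face
    would make the card a falsifying instance."""
    if visible.isalpha():
        return any(violates(visible, n) for n in HIDDEN_NUMBERS)
    return any(violates(letter, visible) for letter in HIDDEN_LETTERS)


cards = ["A", "D", "4", "7"]
to_turn = [card for card in cards if must_turn(card)]  # ["A", "7"]
```

Card 4 is never selected because no hidden letter can turn it into a counterexample; that is exactly the card many subjects choose.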

Again, we seem to have encountered "confirmation bias." There is no
doubt in this simple case that subjects are answering incorrectly. But
there are several factors that can be called on to account for their
mistakes. First, to account for the answer "A only," the previous
suggestion may apply: in the "natural ecology" the generalizations with
which people mainly deal, and the ones which they habitually confirm,
are essentially statistical: "Almost all X's are Y's." For testing these
generalizations, only observations of X's are appropriate. The natural
tendency to treat a generalization, "If something is an X then it is a Y,"
as representing "Almost all X's are Y's," and to test it by looking only
at X's, carries over to the artificial task in which the natural tendency
is incorrect.

Second, it is not uncommon in ordinary English to use the conditional,
"If something is X then it is Y," to express what is more accurately
expressed by a biconditional. This comes about, I conjecture, because
under many circumstances it is already known (already an item in the
corpus of practical certainties of both speaker and listener) that Y's
are X's. "If the hay is too wet, it will mold," will ordinarily (and
correctly) be understood as having the same meaning as: The hay will mold
if and only if it is too wet. Combined with the first tendency, this would
lead subjects to examine both A and 4.

It is interesting to contrast the results of this experiment with
the results of a "formally" similar experiment (Johnson-Laird et al.,
1972) cited by Johnson-Laird and Wason (1977). "The subjects were
instructed to imagine that they were postal workers engaged in sorting
letters on a conveying belt; their task was to determine whether the
following rule had been violated: 'If a letter is sealed, then it has
a 5d stamp on it.'" The material consisted of the back of a sealed
envelope, the back of an unsealed envelope, the front of an envelope with
a 5d stamp, and the front of an envelope with a 4d stamp. Almost all the
subjects correctly chose to examine both the sealed envelope with the unseen
face, and the envelope with the 4d stamp.

Cohen (1981, p. 324) suggests that the difference is due to "familiarity
and concreteness in the letter sorting task." There may be elements of
this, but a more salient distinction, in the framework being suggested
here, is that the rule in the postal example is a genuine, stipulative,
a priori rule: All sealed letters shall have, must have, 5d stamps. It
is not the rule that is being tested, but the conformity of the letters
to the rule. It would be interesting to test the performance of subjects
on a similarly concrete and familiar task where the "rule" is not a
stipulative one, but a descriptive generalization such as: If a letter
has a 5d stamp, it has the return address on the back.

Some of the earliest work on the relation between logic and thinking
was done by Mary Henle (1962). She presents evidence "that even
where the thinking process results in error, it can often be shown that
it does not violate the rules of the syllogism. Many errors were found
to be accounted for not in terms of a breakdown of the deductive process
itself, but rather in terms of changes in the material from which the
reasoning proceeds." (p. 377) The material used in her studies was deliberately
chosen to be informal, and her subjects were (as far as possible)
"logically naive." Under these circumstances it would be difficult
indeed to ensure that the reasoning processes of the subjects did not use
material from their own bodies of knowledge.

In ordinary argument, this dependence on a body of practical
certainties is even more pronounced. Johnson-Laird (1977) offers the
example (from Abelson and Reich, 1969): "He went to three drugstores,
therefore the first two drugstores didn't have what he wanted." In such
cases it is clear that the argument is not deductive: nobody's corpus of
practical certainties excludes the possibility of the premise being true
and the conclusion false. The argument is of the statistical "almost
always" form. It is also clear that the persuasiveness of the argument,
its probabilistic soundness, depends on two things that can be represented
in the framework suggested. First, enough knowledge in the rational corpus
(of both arguer and listener) to warrant the inclusion of the statistical
statement: "Almost always when a person goes to three drugstores, it is
because the first two didn't have what he wanted." (This is noted by
Johnson-Laird.) And second (not remarked on by Johnson-Laird), knowledge
of the drugstore visiting person which allows him to be a random member
of the set of people who visit three drugstores with respect to having a
particular reason for doing so. Consider the difference, for example, if
the totally ambiguous "he" in the argument is replaced by "Tom" (the
argument still goes through) and if it is replaced by the definite
description, "the oldest comparison shopper employed by Rite-Aid" (the
argument fails).

7. Probabilistic Inference.

Some of the most recent and popular work on human inference making
concerns the alleged deficiencies in the ability of people to perform
probabilistic inference correctly. In itself, this is not surprising;
statistical argument is more complex than deductive argument by its very
nature. Indeed, its principles have yet to be formulated in a way that
conforms to the intuitions of professional statisticians. (This raises
a difficulty for the suggestion of Stich and Nisbett (1980) that one should
turn to "experts" for criteria of justification. In statistical inference
there are large groups of acknowledged experts upholding contrary criteria
of justification.)

One well known example (Kahneman and Tversky (1973)) concerns the
"neglect of base rates." In this experiment subjects are told that in a
certain town 85% of the cabs are blue, and 15% of the cabs are green.
A witness to an accident identifies a cab as green, and it is given that
under the circumstances he can make correct identifications of color
80% of the time. The subjects were then asked for the probability that the
cab involved in the accident was blue.

The median estimated probability was 0.2. The authors claim that this
shows a serious error, since the contingency table employing the general
relative frequency of blue and green cabs looks like this:

                 seen as blue    seen as green
  truly blue         .68             .17           .85
  truly green        .03             .12           .15

The conditional probability that a cab is blue, given that the witness says
it is green, is thus .17/(.17 + .12) = .59.
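
The arithmetic behind the contingency table can be reproduced in a few lines of Python. This is a sketch under one stated assumption: the witness is taken to be equally accurate (80%) for both colors, which is how the problem is usually read. The function and key names are hypothetical.

```python
def joint_table(base_rate_blue, accuracy):
    """Joint relative frequencies of (true color, reported color).

    Assumption: the witness is equally accurate for both colors.
    """
    base_rate_green = 1 - base_rate_blue
    return {
        ("blue", "seen blue"): base_rate_blue * accuracy,
        ("blue", "seen green"): base_rate_blue * (1 - accuracy),
        ("green", "seen blue"): base_rate_green * (1 - accuracy),
        ("green", "seen green"): base_rate_green * accuracy,
    }


table = joint_table(base_rate_blue=0.85, accuracy=0.80)

# Condition on the report "green": restrict attention to that column.
seen_green = table[("blue", "seen green")] + table[("green", "seen green")]
p_blue_given_seen_green = table[("blue", "seen green")] / seen_green
p_green_given_seen_green = table[("green", "seen green")] / seen_green
# Rounded to two places these are .59 and .41.
```

Conditioning amounts to confining attention to the "seen as green" column and asking what share of it is truly blue.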

Cohen (1981) claims that this is no error at all, and that the subjects
are right and the investigators wrong. "The fact that cab colours
actually vary according to an 85/15 ratio is strictly irrelevant to this
estimate because it neither raises nor lowers the probability of a specific
cab-colour identification being correct on the condition that it is an
identification by the witness. A probability that holds uniformly for each
of a class of events because it is based on causal properties, such as the
physiology of vision, cannot be altered by facts, such as chance distributions
that have no causal efficacy in the individual events" (pp. 328-329).

Cohen's argument seems wrong. Suppose the story were changed; suppose
that it is given as a pure problem in inference. Suppose the cab in
question were not singled out by having been in an accident, but was
selected by some stochastically random procedure from among the cabs in
the town. Otherwise the story is the same. Regardless of the "causal basis"
of the witness's identification of the color of the cab, it is clear that the
contingency table would give the correct probability that the cab is blue: .59.

An experiment like this was described by Lyon and Slovic (1976):
The numbers are kept the same, but the population is a population of
lightbulbs, of which 15% are defective, and the "witness" is a scanning device
which is 80% accurate. The lightbulb in question was explicitly said to be
chosen at random. Again, it was discovered in this experiment, as well as
in a large variety of similar experiments, that the subjects tended to ignore
the base rate, or to give it insufficient weight.

Nevertheless, there is a difference between the experimental results for
the two problems, as reported by Lyon and Slovic. In a version of the cab
problem in which the probability of green was asked for, the median estimate
was .80. In the corresponding lightbulb problem, the median estimate was
also .80. But the interquartile range was different in the two problems:
in the cab problem it was reported as .80-.80. In the lightbulb problem it
was .25-.80, where the correct answer is .41. This difference is revealing:
it suggests that a significant number of the subjects in the lightbulb
problem did attempt to take account of the base rate, while practically
none of the subjects in the taxicab problem did so.

One possible explanation of this would lie in the fact that no information
about the relative frequency with which blue and green cabs are involved in
accidents is given in the problem. It would be interesting to see if the
results were more like those of the lightbulb problem if it were stated that
15% of the cabs involved in accidents were green and 85% were blue. It
would also be interesting to ask the subjects in the original taxicab problem
to estimate the proportion of accidents involving cabs that involve blue
cabs. Would it be the canonical 85%? Or would subjects say "there isn't
enough information"? Or would subjects (improperly) infer from the one case
they "know" about that blue cabs are less likely to be involved in accidents
than green cabs?

It is worth observing that there is a reconstruction of the cab problem
which strongly supports the intuitions of the subjects who ignore base rates.
Suppose the corpus of practical certainties of the subject contains the
following statistical knowledge:

(1) %(Cabs, blue) = .85

(2) %(Cabs identified as green, blue) = .20

(3) %(Cabs in accidents identified as green, blue) ∈ [0,1]

(4) %(This particular cab in an accident identified as green, blue) ∈ [0,1]

These statistical statements mention increasingly specific potential reference
sets. The relevant proportions in (1) and (2) differ, so the holder of this
corpus should base his probability on (2) rather than on (1); (3) and (4)
concern more specific reference sets yet, but they provide no new information.
According to the rules for choosing reference classes (Kyburg, 1974), (2)
provides the appropriate basis for the probability that the particular
cab in question is blue.
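
The pattern of reasoning just described can be caricatured in a short Python sketch. It is a deliberate simplification and not the full set of rules in Kyburg (1974): it assumes the candidate reference sets are already ordered from most general to most specific, and it moves to a more specific set only when that set's interval fails to contain the interval currently in hand, that is, only when the more specific set says something new. The function name and data layout are hypothetical.

```python
def choose_reference_class(candidates):
    """Pick a reference class from candidates ordered general -> specific.

    Each candidate is (label, (low, high)). Simplifying assumption: a more
    specific class replaces the current one only if its interval does not
    contain the current interval (otherwise it adds no information).
    """
    label, interval = candidates[0]
    for next_label, (low, high) in candidates[1:]:
        contains_current = low <= interval[0] and interval[1] <= high
        if not contains_current:
            label, interval = next_label, (low, high)
    return label, interval


# The four statements of the cab reconstruction.
cab = [
    ("(1)", (0.85, 0.85)),
    ("(2)", (0.20, 0.20)),
    ("(3)", (0.0, 1.0)),
    ("(4)", (0.0, 1.0)),
]
cab_choice = choose_reference_class(cab)  # ("(2)", (0.2, 0.2))
```

On this toy rule, statements (3) and (4) are passed over because the interval [0,1] rules nothing out, which matches the conclusion that (2) is the appropriate basis.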

It is also worth noting that this kind of reconstruction is not
permissible in the lightbulb example. Corresponding to (3) we would have:

(3') %(selected lightbulbs testing defective, non-defective) ∈ [0,1]

Given the conditions of the problem, that the lightbulbs are selected at
random, it follows from statements corresponding to (1) and (2) that

(3'') %(selected lightbulbs testing defective, non-defective) = .59

The argument leading to (3'') is arithmetically nontrivial for most people.
It is thus not surprising that the subjects didn't get the right answer. But
I would conjecture that it would be quite easy, with pencil and paper, to
convince the subjects that .59 was the right answer. On the other hand,
it is not so easy to convince all subjects that .59 is the right answer in the
cab problem; L. J. Cohen, for example, who has access to unlimited supplies
of paper and pencils, is still unconvinced (1979, 1980, 1981).

There is still the fact to be explained that the median answer in the
lightbulb problem is 0.80; many subjects must have answered the lightbulb
problem exactly on the lines of the cab problem. A hypothetical explanation
of this might run as follows: In assessing probabilities in ordinary life,
one can ordinarily use a frequency in a reference set. People have relatively
little experience in combining probabilities in accordance with Bayes'
Theorem. For example, it seems more natural in giving the probability of
a one on a toss of a die, given that it is an odd number, to calculate
that a third of the odd numbered tosses yield a one, than to divide a
sixth (ones) by a half (odd numbers). Perhaps this is something that
could be got at by a cleverly designed experiment.
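
The two routes to the same answer in the die example can be set side by side in Python. This is only an illustrative sketch for a fair six-sided die; the variable names are hypothetical.

```python
from fractions import Fraction

outcomes = range(1, 7)                      # faces of a fair die
odd = [n for n in outcomes if n % 2 == 1]   # the reference set: 1, 3, 5

# Route 1: a frequency within the reference set of odd tosses.
by_counting = Fraction(sum(1 for n in odd if n == 1), len(odd))

# Route 2: divide a sixth (ones) by a half (odd numbers).
p_one = Fraction(1, 6)          # every one is also an odd toss
p_odd = Fraction(len(odd), 6)
by_division = p_one / p_odd

# Both routes give one third.
```

The first route never mentions the unconditional probabilities at all, which may be why it feels more natural.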

Given a choice of conflicting frequencies in two possible reference
sets, therefore, it is usually the case that one of those frequencies
will be suitable and the other unsuitable as a basis for a probability.
In the lightbulb problem the correct reference set is not one of the
options given: the subject must devise the correct reference set, and
compute its relevant relative frequency, on his own. The difficulty that
subjects have with the lightbulb problem, therefore, seems to stem from what
might be called the natural ecology of multiple choice questions. Given the
matrix of relative frequencies as part of the data in the lightbulb
problem, how would subjects do?

There are several other biases in intuitive probabilistic inference
that have been subjected to experimental assessment. Three such biases
are discussed in a well known paper by Tversky and Kahneman (1974). The
representativeness heuristic leads to bias in the assessment of
the probability that an A is a B: if A and B are very
much alike, the probability of a (particular) A being a B is assessed as
higher than it should be; if A and B are very dissimilar, the assessed
probability is excessively low. The representativeness heuristic leads
to the neglect of prior probabilities. In one particular experiment
described in Tversky and Kahneman (1974) and reported in Kahneman and
Tversky (1973), the subjects were told the appropriate base rates, but
neglected them: "subjects evaluated the likelihood that a particular
description belonged to an engineer rather than to a lawyer by the degree
to which this description was representative of the two stereotypes with
little or no regard for the prior probabilities of the categories"
(Johnson-Laird and Wason (1977), p. 328).

It seems clear that in this case the subjects are flatly wrong. It
is curious that when a description was given that could fit either of the
categories equally well, the subjects still ignored the base rates, taking
the probabilities to be .5 and .5. A possible explanation of this is that
the subjects acted as if the base rate among the individuals selected to
be categorized was 50%. It is not hard to imagine subjects beguiling
themselves as follows: The individual selected is either a lawyer or an
engineer; that's one of each, so the base rate among those selected is .50,
and the only relevant evidence I have to decide between the two alternatives
consists of the description.

Even more blatant violations of normative statistical theory are to
be found when representativeness is applied to frequencies, either in
estimating the frequency in a sample from a known population or in
estimating the population from which a sample of known frequency has been
drawn (Tversky and Kahneman, 1974). Here again, the subjects are simply
in error; but a possible explanation is at hand. Statistical generalizations
of the forms "Almost all A's are B's" and "Almost no A's are B's" are the
sorts of generalizations most people usually find most useful in
their dealings with the real approximate world. But the representativeness
heuristic does not work badly with respect to generalizations of this sort.

The authors offer other conjectures as to why people fail generally
to learn from experience statistical facts concerning regression, or
the relation between sample size and variability. No doubt many factors
are at work. But it should not be concluded (as one might conclude if the
competence of ordinary people were taken as a standard of rational
belief) that because few people take account of regression to the mean
in making predictions, statistical theory does not provide a norm of
rationality.

On the other hand, there are many instances of intuitive probabilistic
inference in which it is not clear how to apply statistical theory. In
another paper (Kahneman and Tversky 1979), the same authors offer some
suggestions for improving prediction. They emphasize the importance of
considering "distributional" information, as opposed to relying too heavily on
the information embodied in the unique case under consideration. "The
analyst should therefore make every effort to frame the forecasting
problem so as to facilitate the utilization of all the distributional
information that is available to the expert" (typescript, p. 5). Since
the authors accept a subjectivistic interpretation of probability, they
accept that individuals may assign probabilities to particular cases that
are not based on any form of statistical knowledge. Their emphasis on
distributional knowledge, from our point of view, reflects a logical truism:
There can be no (rational) probabilities without underlying frequencies.

Of  course,  there  are  all  kinds  of  distributional  information.  The 
unique  case  under  consideration  can  be  seen  to  fall  in  a  great  many  classes 
about  which  we  have  statistical  information.  Thus  it  is  crucial  for 
practical guidance, as well as for the characterization of rationality,
to be able to sort out that distributional information. In the authors'
simple case of a publisher attempting to predict the sales
of a book, they suggest first that "the selection of a reference
class is straightforward" (p. 8) and, later, that "For example,
the reference class for the prediction of the sales of a book could consist
of other books by the same author, or books on the same topic, or of books
of the same general type . . . the most inclusive class may allow for the
best estimate of the distribution of outcomes, but it may be too
heterogeneous to permit a meaningful comparison to the book at hand . . . the class
of books on the same topic could be the most appropriate" (p. 9).

The authors give no normative criteria for the choice of a reference class,
but it is clear that this is a problem that is on their minds. On the view
of probability and rationality being developed here, it is clearly crucial.
It is a matter that receives considerable formal attention in the full
development of epistemological probability, but it would take us too far
afield to consider it in detail here. As Einhorn and Hogarth (1981, p. 65)
point out, "There is no generally accepted normative way of defining the
appropriate population."

8.  Decision  Under  Uncertainty. 

In their review of behavioral decision theory, Einhorn and Hogarth
(1981) draw attention to a number of apparent conflicts between the ordinary
normative theory (subjective expected utility theory) and the behavior
of real agents. They point out that even nonregressive estimates may
turn out to be more profitable than regressive ones in an environment that
is nonstationary. "[T]he optimal prediction is conditional on which
hypothesis you [the experimenter] hold." All of the oddities of intuitive
probabilistic inference are reflected in the differences between the
recommendations of normative choice theory and the description of the
ways in which people choose. But there are other differences as well.
For example, the choice problem may be stated in two apparently equivalent
ways, and it may lead to different choices under each statement. Figure/ground
relations, learning, attention, etc., all play a role, in addition to the
role played by utility and probability, in the explanation of human choice.
As the authors say (p. 75), "the descriptive adequacy of E(U) [expected
utility theory] has been challenged repeatedly."

Furthermore, the normative adequacy of E(U) has itself begun to be
challenged, for example, by prospect theory. A particularly telling
challenge on an intuitive level is provided by Lopes (1981). She discusses
the traditional St. Petersburg paradox, and a piece of anecdotal evidence
reported by Samuelson (1963).

The  St.  Petersburg  paradox  goes  like  this:  A  fair  coin  is  tossed 
until  heads  first  appears  —  say  on  the  nth  toss.  The  player  then 
receives a prize of 2^n dollars (or, to avoid questions of the utility of
money, 2^n utiles). What is the fair price for the player to pay for the
privilege  of  playing  the  game  once?  The  answer  —  the  expected  value  of 
the  game  —  turns  out  to  be  infinite.  Most  people  would  pay  relatively 
little.  Is  this  irrational? 
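
The infinite expectation can be checked directly. Assuming, as in the description above, that heads first appears on toss n with probability 2^(-n) and that the prize is then 2^n, every possible game length contributes exactly one unit to the expected value, so the sum diverges:

```latex
\begin{align*}
E[\text{prize}]
  &= \sum_{n=1}^{\infty} \Pr(\text{first head on toss } n)\cdot 2^{n} \\
  &= \sum_{n=1}^{\infty} 2^{-n}\cdot 2^{n}
   = \sum_{n=1}^{\infty} 1
   = \infty .
\end{align*}
```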

Lopes turns the problem around: If someone offers to run the game
for a number of players, only charging k dollars, is he necessarily
irrational? Using dollar units, a number of Monte Carlo simulations (see footnote 6)
were run in which 100 businesses, starting with a capital of $10,000,
sell opportunities to play the Petersburg game at prices of $25, $50,
and $100. After a million customers, the prospects of the businesses
selling chances for $100 are not bad: only 10% of them had gone broke,
and the mean and median outcomes were 56 and 79 million dollars respectively.
Not bad for a business with only a $10,000 start-up cost. The only problem
is finding enough customers. It is not correct to say (as Lopes does,
p. 378) that if it is a good business for the businessman, it cannot be
a good one for the customers: most successful businesses survive because
both the businessman and the customer increase their utilities in the
exchange. But it does seem unlikely that the Petersburg business can
compete successfully with the State Lottery.
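
A simulation of this general kind can be sketched in a few lines. The following is only an illustration and not a reconstruction of Lopes's program: the function names are hypothetical, the rule that a business is broke as soon as its capital falls below zero is an assumption, the mean and median are taken over surviving businesses only, and the default number of customers is far smaller than the million mentioned in the text so that the sketch runs quickly.

```python
import random
import statistics
from typing import Optional


def play_petersburg(rng: random.Random) -> int:
    """Toss a fair coin until heads first appears on toss n; the prize is 2**n."""
    n = 1
    while rng.random() < 0.5:  # tails with probability 1/2, so toss again
        n += 1
    return 2 ** n


def run_business(price: int, capital: int, customers: int,
                 rng: random.Random) -> Optional[int]:
    """Return the final capital, or None if the business goes broke."""
    for _ in range(customers):
        capital += price - play_petersburg(rng)
        if capital < 0:  # assumed bankruptcy rule: the prize cannot be covered
            return None
    return capital


def simulate(price: int, businesses: int = 100, capital: int = 10_000,
             customers: int = 2_000, seed: int = 0) -> dict:
    """Run several independent businesses and summarize their fates."""
    rng = random.Random(seed)
    outcomes = [run_business(price, capital, customers, rng)
                for _ in range(businesses)]
    survivors = [c for c in outcomes if c is not None]
    return {
        "price": price,
        "broke": businesses - len(survivors),
        "mean": statistics.mean(survivors) if survivors else None,
        "median": statistics.median(survivors) if survivors else None,
    }


if __name__ == "__main__":
    for price in (25, 50, 100):
        print(simulate(price))
```

Raising `customers` toward a million brings the sketch closer to the setting described in the text, at the cost of a much longer running time in pure Python.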

Within  the  framework  at  hand,  the  analysis  of  the  game  is  quite 
straightforward.  Suppose  the  index  of  practical  certainty  is  .001.  In 
the evidential corpus it is assumed known that the coin is fair; it is
thus  practically  certain  that  the  game  will  last  no  more  than  ten  tosses, 
and  its  practical  expected  value  will  be  $10.00.  The  entrepreneurs  will 
have  a  hard  time  selling  chances  for  $100.00. 
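
The arithmetic behind these figures can be made explicit. The sketch below assumes one particular reading of the text: a game length is disregarded once the chance of the game lasting longer is no greater than the index of practical certainty, and the practical expected value is the ordinary expectation restricted to the remaining game lengths. The function names are illustrative only.

```python
from fractions import Fraction


def max_practical_tosses(index: Fraction) -> int:
    """Smallest n with P(game lasts more than n tosses) = 2**-n <= index."""
    n = 0
    while Fraction(1, 2 ** n) > index:
        n += 1
    return n


def practical_expected_value(index: Fraction) -> Fraction:
    """Expected prize counting only games that end within the practical limit."""
    limit = max_practical_tosses(index)
    # A game ending on toss n has probability 2**-n and prize 2**n,
    # so each admissible n contributes exactly 1 to the total.
    return sum(Fraction(1, 2 ** n) * 2 ** n for n in range(1, limit + 1))


index = Fraction(1, 1000)
print(max_practical_tosses(index))      # practical maximum number of tosses
print(practical_expected_value(index))  # practical expected value in dollars
```

With an index of .001 the cutoff falls at ten tosses, since 2^(-10) is just under .001 while 2^(-9) is not, and the ten admissible game lengths contribute one dollar each.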

How  does  this  differ  from  the  State  Lottery?  One  should  be  practically 
certain  that  one  is  not  going  to  win  the  lottery;  is  one  therefore  irrational 
in buying a ticket? As is well known, there are circumstances in which it
is  not  irrational.  They  depend  on  the  relation  between  utility  and  money, 
and  the  fact  that  money  is  not  of  monotonically  increasing  utility.  A 
dollar  isn't  worth  much  —  what  can  you  do  with  a  dollar?  —  but  having 
a  fortune  would  be  very  nice.  In  buying  a  lottery  ticket,  your  practical 
expected  value  should  be  0  —  you  should  be  practically  certain  that  you 
won't  win.  On  the  other  hand,  the  marginal  disutility  of  parting  with  a 
single  dollar  can  also  be  0,  so  in  terms  of  practical  expected  utility  the 
exchange is fair for you. If you count into the practical expected utility
the opportunity to daydream about winning, the exchange is better than fair.

Another situation discussed by Lopes comes from Samuelson (1963).
Samuelson  tells  of  offering  to  bet  some  colleagues  $100  to  $200  that  a 
specified  side  of  a  coin  would  not  appear  on  its  first  toss.  None  of 
the  colleagues  took  him  up.  One  person  argued  as  follows  (quoted  from 
Lopes,  p.  382): 

I  won't  bet  because  I  would  feel  the  $100  loss  more  than  the  $200 
gain.  But  I'll  take  you  on  if  you  promise  to  let  me  make  100 
such  bets  . . .  One  toss  is  not  enough  to  make  it  reasonably  sure 
that  the  law  of  averages  will  turn  out  in  my  favor.  But  in  a 
hundred  tosses  of  a  coin,  the  law  of  large  numbers  will  make  it  a 
darn good bet. I am, so to speak, virtually sure to come out
ahead  in  such  a  sequence... 

Samuelson finds this response irrational. Lopes sides with the
colleague. The colleague feels, correctly, that he can be practically
certain that in a series of a hundred gambles, he will come out ahead.
It is true that he may not; he may lose $10,000. But the probability of
this is very small, lower than the probability that on his next air
trip he will be killed. A practical man does not take such possibilities
as real. (Then why buy air insurance? The explanation is roughly the same
as that for buying a lottery ticket. You'll hardly miss a couple of
dollars, and although your practical expectation is 0 or slightly negative
in dollars, you obtain the added utility of peace of mind.)
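
The colleague's confidence can be checked with an exact binomial computation. The sketch assumes that each of the 100 bets is an independent toss of a fair coin, winning $200 or losing $100; the function names are illustrative.

```python
from math import comb


def net_winnings(wins: int, bets: int = 100,
                 gain: int = 200, loss: int = 100) -> int:
    """Dollars won after `bets` bets of which `wins` were successful."""
    return gain * wins - loss * (bets - wins)


def prob_ahead(bets: int = 100, gain: int = 200, loss: int = 100) -> float:
    """Exact probability of strictly positive net winnings on fair-coin bets."""
    favorable = sum(comb(bets, w) for w in range(bets + 1)
                    if net_winnings(w, bets, gain, loss) > 0)
    return favorable / 2 ** bets


p_ahead = prob_ahead()
p_lose_all = 1 / 2 ** 100  # every one of the 100 tosses lost: a $10,000 loss
print(p_ahead, 1 - p_ahead, p_lose_all)
```

The colleague comes out ahead whenever he wins at least 34 of the 100 bets, and the chance of failing to do so is below one in a thousand, which is consistent with calling the outcome practically certain at the .001 index used earlier.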

Lester Dubins and L. J. Savage (1965) provide a subjectivistic account
of a structurally similar situation. Suppose you have $1000 and absolutely
have to have $10,000 by the next day. You are in a casino, and gambling
is your only hope. How should you do it? The answer is clearly that
you should stake the whole $1000 on a single 10:1 bet. The more you
divide your stake, the more probable it is that the house odds will get you.
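
The effect of dividing the stake can be illustrated with the classical gambler's ruin formula. This is a sketch under assumptions not found in the text, and it is not Dubins and Savage's own analysis: every bet is taken to be an even-money bet of fixed size, won with probability 18/38 (a hypothetical house edge of the roulette kind). The single long-shot bet mentioned above is not modeled; the sketch only compares fixed stakes of different sizes.

```python
def prob_reach_goal(start_units: int, goal_units: int, p_win: float) -> float:
    """Gambler's ruin: chance of reaching goal_units before 0 when betting
    one unit per round at even money with win probability p_win."""
    if p_win == 0.5:
        return start_units / goal_units
    r = (1 - p_win) / p_win
    return (1 - r ** start_units) / (1 - r ** goal_units)


P_WIN = 18 / 38  # assumed house odds on an even-money bet
BANKROLL, GOAL = 1_000, 10_000

for stake in (1_000, 500, 100, 10):
    chance = prob_reach_goal(BANKROLL // stake, GOAL // stake, P_WIN)
    print(f"stake ${stake}: P(reach ${GOAL}) = {chance:.3g}")
```

Under these assumptions the chance of reaching $10,000 falls steadily as the stake per bet shrinks, which is the qualitative point made in the text.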

An attempt to make sense not only of the choices that people
actually make when faced with uncertainty but of relatively clear and
compelling intuitions concerning such choices is prospect theory (Tversky
and Kahneman 1981). It is not clear to me exactly how well it accords
with the views presented here, but there are clearly certain similarities.

In this theory decision weights correspond to "subjective probabilities,"
but they are different in a number of respects. They do not sum to 1
(subcertainty), just as the relevant probabilities in the St. Petersburg
problem do not sum to 1. "The function is not well behaved near the
endpoints" (p. ). On the present account, probabilities greater than
p, the level of practical certainty, are treated as 1, and probabilities
less than 1-p are treated as 0. The asymmetry between large and small
probabilities in prospect theory is not reflected here. But it may be
that this apparent asymmetry is more fruitfully taken account of in the
valuation function, about which I have nothing to say here.

9. Conclusion.

These reflections suggest a three-fold connection between the
philosophical normative investigation of rationality, and the empirical
psychological study of belief.

The  first  connection  is  that  empirical  studies  may  suggest  certain 
facts  relevant  to  the  development  of  normative  constraints.  The  normative 
constraints  must  be  appropriate  to  the  kinds  of  beings  we  are.  This  is 
not to say that we must automatically embody them, or even that we must
be able to achieve them (a counsel of perfection), but that we must be
capable of approaching them, of learning to do better. At the same time,
most empirical studies take for granted certain normative constraints. (In
studying degrees of belief, we assume that people's bodies of knowledge are
consistent, for example.) These presupposed constraints may or may not be
appropriate;  if  they  are  inappropriate,  they  may  vitiate  the  results  of 
the  psychological  investigation. 

The second connection is that philosophical investigations into
rationality may provide a useful framework within which psychological
investigation can be conducted. For example, a structure which allows for
some kind of probabilistic acceptance may prove useful for exploring the
oddities of choice under uncertainty when that uncertainty is reflected
by chances close to 0 or close to 1.

The  third  and  most  obvious  connection  is  provided  by  the  fact  that 
whatever  we  wish  to  conclude  about  the  rationality  with  which  people  draw 
conclusions  or  apportion  their  beliefs,  we  want  our  own  conclusions  to  be 
rational,  to  be  well  supported  by  the  evidence.  Even  an  argument  to  the 
effect that (other) people's beliefs do not conform to sound inductive
canons  ought  itself  to  be  based  on  sound  inductive  principles. 

To  sum  up:  I  take  a  theory  of  rational  belief  to  be  a  normative  theory. 
I  take  its  object  to  be  the  improvement  of  our  understanding.  While  such 
a  theory  cannot  be  made  up  out  of  whole  cloth  —  it  must  be  appropriate  to 
the  beings  whose  understanding  we  are  trying  to  improve  —  neither  should 
it  merely  reflect  what  people  actually  do.  Intuition  and  introspection  and 
empirical  investigation  may  reveal  general  principles  in  simple  and  concrete 
cases. Analysis and argument may reveal connections among these principles,
or  defects  in  them  as  applied  to  more  complicated  cases,  or  limitations 
in  their  scope. 

What  I  have  tried  to  do  here  is  to  illustrate  this  process  by  showing 
the way in which my approach to probability and inductive acceptance throws
light  on  several  things: 

(a)  The  deductive  structure  of  the  set  of  rationally  accepted  statements 
(assuming  there  is  one),  thus  providing  a  connection  between  deductive 
cogency  and  rational  belief  that  is  lacking  (or  only  implicit)  in  logic 
itself. 

(b) The addition and deletion of statements to and from a body of accepted
statements; this is a matter that involves probability, but it is one that
pure Bayesian conditionalization can throw no light on.

(c) Degrees of belief in statements that are not accepted. I suggest
that the constraints are more extensive than Bayesians often suppose
(requiring conformity to statistical knowledge), but less precise (being
represented by intervals, rather than real numbers).

(d) A distinction between conditional degrees of belief (reflected in
betting ratios on conditional bets) and degrees of belief conditional
on the acceptance of new data.

These matters constitute a single tangled web; there is no way in
which we can approach them piecemeal. And since the web is so tangled,
we have an explanation both of the inconclusiveness of psychological
experiments designed to explore rationality of belief, and of the frustration
that has led some philosophers to despair of finding a "broad reflective
equilibrium" of rationality (Cohen 1981). But though I think it is clear
that the problem of characterizing rationality is a difficult one, far
more difficult than many have realized, it does not seem insuperable.
The very difficulties we uncover contribute to our understanding. And the
fact that we progress at all, the fact that we listen to each other's
arguments, and recognize an obligation to deal with them, suggests that
our goal can be approached.
FOOTNOTES

1. Some remarks on notation may be helpful. I use capital letters
P, R, etc., to stand for declarative sentences in the object language:
"Fido is in the manger," "There is a black dog in the manger," "All
the dogs on the farm are in the manger," "All crows are black," "Between
50% and 70% of the successful conceptions yield brown offspring." The
capital letter S I reserve for the set of sentences that constitutes the
background knowledge, or the body of reasonably accepted beliefs, or
the rational corpus, of the agent. Lower case letters x, y, etc.,
are metalinguistic variables, representing terms of the object language:
these may be names of individuals ("Fido"), sets ("the set of crows"),
sets of sets ("the set of subsets of the set of crows"), etc. Relative
logical type, which is all we need be concerned with, is given by context.
For example, "x = y" and "x ⊂ y" tell you that x and y are of the same
logical type; "x ∈ y" tells you that whatever type x may be, y is of
the type of sets that have as members objects of type x. This flexibility
is essential if we want to handle both the probability that the next counter
is black, and the probability that the set of counters we have examined
is representative of the proportion of black counters in the bag. The lower
case letters p, q, r, etc., are used both as metalinguistic variables
taking as values real number designators in some canonical form (binary,
decimal) and also the real numbers between 0 and 1 so designated. Thus
[p, q] may represent the expression "[.6, .7]" in the object language, and
may also represent, in our metalanguage, the closed interval of real numbers
[.6, .7].
2. It follows from the principle of epistemic conditionalization that
if S is relevant to T, then T is relevant to S, where S is relevant to T
just in case the degree of belief in T, given S, $B_S(T)$, is different
from the unconditional belief in T. Suppose that S is the statement
that in the long run about 60% of the tosses of this coin land heads,
and T is the statement that the next toss of this coin lands heads. Take
the degree of belief in T, relative to our ordinary knowledge of coins,
to be 0.5. S is relevant to T:

\[ B_S(T) = 0.6 = \frac{B(S \wedge T)}{B(S)} \neq B(T) \]

But it is surely stretching this to say that T is relevant to S. That a
coin has been tossed and landed heads shouldn't change the probability of
the statistical statement:

\[ B_T(S) = B(S) \]
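
The symmetry asserted at the start of this footnote can be verified in one line, assuming that the conditional degree of belief is the ratio of B(S and T) to B(S), as in the display above, and that B(S) and B(T) are both positive:

```latex
\[
B_S(T) \neq B(T)
\iff B(S \wedge T) \neq B(S)\,B(T)
\iff \frac{B(S \wedge T)}{B(T)} \neq B(S)
\iff B_T(S) \neq B(S).
\]
```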

3. Or too strong, if there are no non-probabilistic dispositions to act.

4.  I  believe  so  many  things  to  be  true  that  I  am  almost  certain  that 
at least one of them must be false.

5. This has been argued at length in Kyburg (1961), Kyburg (1974), and
a  number  of  papers,  many  of  which  are  collected  in  Kyburg  (1983a). 

6. Monte Carlo simulations involve programming a computer to undertake
a vast number of trials embodying computer-selected random numbers to
determine the outcomes of the trials. The computer simulation thus
can reflect the long run outcome of a stochastic process.